Enhanced Detection of DNA Viruses in the Cerebrospinal Fluid of Encephalitis Patients Using Metagenomic Next-Generation Sequencing

The long and expanding list of viral pathogens associated with encephalitis confounds current diagnostic procedures, and in up to 50% of cases, the etiology remains undetermined. Sequence-agnostic metagenomic next-generation sequencing (mNGS) obviates the need to specify targets in advance and thus has great potential in encephalitis diagnostics. However, the low relative abundance of viral nucleic acids in clinical specimens poses a significant challenge. Our protocol employs two novel techniques to selectively remove human material at two stages, significantly increasing the representation of viral material. Our bioinformatic workflow using open source protein- and nucleotide sequence-matching software balances sensitivity and specificity in diagnosing and characterizing any DNA viruses present. A panel of 12 cerebrospinal fluid (CSF) samples from encephalitis cases was retrospectively interrogated by mNGS, with concordant results in seven of nine samples with a definitive DNA virus diagnosis; a different herpesvirus was identified in the other two. In two samples with an inconclusive diagnosis, DNA viruses were detected, and in a virus-negative sample, no viruses were detected. This assay has the potential to detect DNA virus infections in cases of encephalitis of unknown etiology and to improve the current screening tests by identifying new and emerging agents.

INTRODUCTION
Encephalitis is a severe neurological syndrome defined by inflammation of the brain parenchyma in association with clinical evidence of neurological dysfunction (Tunkel et al., 2008). In Western countries, its annual incidence has been estimated to be 0.7-12.6 per 100,000 for adults and 10.5-13.8 per 100,000 for children (Jmor et al., 2008;Granerod et al., 2010c;Michael et al., 2010). Mortality rates range between 7 and 18%, and among the survivors, severe disability has been reported in up to 56% of the cases (Mailles and Stahl, 2009;Granerod et al., 2010a;Thakur et al., 2013). Encephalitis has multiple etiologies and pathogeneses. Viruses have been reported as the most common etiological agents, causing 20-50% of the encephalitis cases (Glaser et al., 2006;Granerod et al., 2010b;Ambrose et al., 2011). Immune-mediated etiology has been increasingly recognized as the second most common cause of the disease (Gable et al., 2009;Granerod et al., 2010a;Scheer and John, 2016). Strikingly, in more than 50% of cases, the etiology remains undetermined (Glaser et al., 2006;Florance et al., 2009;Gable et al., 2009;Venkatesan et al., 2013).
The "gold standard" diagnostic test is the pathologic examination and testing of brain tissue; however, this is rarely done ante-mortem due to the potential morbidity associated with an invasive neurosurgical procedure. The most frequently used diagnostic procedures include PCR detection of causative pathogens in cerebrospinal fluid (CSF) and blood, serological testing for specific antibodies in blood and CSF, and occasionally pathogen culture (Solomon et al., 2007). Herpes simplex virus type 1 (HSV-1), varicella-zoster virus (VZV), and any of a number of Enterovirus species are identified by CSF PCR in 90% of the cases where a viral pathogen is identified (Solomon et al., 2012). Other members of the Herpesviridae are commonly detected in encephalitis cases, namely HSV-2, Epstein-Barr virus (EBV), cytomegalovirus (CMV), and human herpesvirus types 6 and 7 (HHV-6 and -7), in addition to viruses from diverse families including Adenoviridae, Paramyxoviridae, Orthomyxoviridae, Polyomaviridae, Rhabdoviridae, Parvoviridae, Astroviridae, Pneumoviridae, Retroviridae, several arboviruses from the Flaviviridae, Bunyaviridae, and Reoviridae, and both zoonotic and non-zoonotic members of the Togaviridae and Arenaviridae (Palacios et al., 2008;Mailles and Stahl, 2009;Quan et al., 2010;Chan et al., 2014;Fok et al., 2015;Naccache et al., 2015;Haley and Atwood, 2017;Crawshaw et al., 2018;Mehta et al., 2018;Vidal et al., 2019). This list is not exhaustive.
Existing diagnostic methods, although somewhat successful for known viruses, are limited by their high specificity when employed to detect genetically divergent, unknown, or unexpected viruses that might be present in the sample. Together with the large and expanding number of pathogens reported to be capable of causing encephalitis (Granerod et al., 2010c;Gurav et al., 2010;Benjamin et al., 2011;Solomon et al., 2012;Woolhouse et al., 2012;Woolhouse and Adair, 2013;Fok et al., 2015;Hoffmann et al., 2015;Kennedy et al., 2017), it is perhaps unsurprising that so many cases have inconclusive etiology.
Metagenomics, the direct and sequence-agnostic analysis of all genetic material within a sample, coupled with the massively parallel sequencing capabilities of metagenomic next-generation sequencing (mNGS) represents a potential breakthrough in the diagnosis of encephalitis and has led to the discovery of a large number of novel and/or unexpected viral agents of disease (Tan et al., 2013;Phan et al., 2015;Kawada et al., 2016;Kang et al., 2017;Morfopoulou et al., 2017;Bukowska-Ośko et al., 2018;Oechslin et al., 2018;Piantadosi et al., 2018;Eibach et al., 2019;Wilson et al., 2019).
Nonetheless, viral mNGS is a challenging approach due to the low relative abundance of virus-derived material in clinical specimens compared to host-derived material. Improving this ratio is key to achieving a sufficient number of viral reads to allow reliable detection and accurate identification of viruses in a sample (Chan et al., 2014;Hall et al., 2014;Kohl et al., 2015;Lewandowska et al., 2015;Bukowska-Ośko et al., 2017). Selective depletion of the ribosomal RNA (rRNA) fraction followed by DNAse digestion resulted in a significant methodological improvement in the mNGS protocol for RNA viruses previously developed in our laboratory (Manso et al., 2017). However, effective enrichment of viral DNA has proven to be more challenging due to the lack of differential motifs between human and viral DNA that would allow the former to be depleted without affecting the number of copies of the latter in the sample.
Here, we describe a DNA mNGS protocol focused on increasing the relative abundance of viral DNA at two stages: (i) before extraction, by performing selective lysis of mammalian cells with digitonin, a specific steroidal saponin used by researchers to manipulate cell membranes (Hannah et al., 1998;Jamur and Oliver, 2010), followed by DNAse digestion of host genomic DNA, and (ii) after generating metagenomic libraries, by size selection of library fragments. These two approaches notably improve the detection and characterization of DNA viruses in the Clinical Virology Multiplex I panel (CVM panel) and clinical CSF samples.

Ethics Statement
All experiments were performed in accordance with the "Guidance on Conducting Research in Public Health England" (Version 3, October 2015; Document code RD001A). This study involved the use of archived, residual samples that were collected as part of a prospective etiological study on encephalitis in the UK with approval from the North and East Devon Multicenter Research Ethics Committee (05/Q2102/22). The samples were anonymized by removal of any patient identifiable information and assignment of a non-specific project number prior to genetic characterization.

Clinical Virology Multiplex I Panel (CVM Panel)
A lyophilized reagent comprising 11 DNA viruses known to cause encephalitis was obtained from the National Institute for Biological Standards and Control (NIBSC; Potters Bar, UK, catalog number 15/130-xxx). Prior to extraction, the reagent was resuspended in 1 ml of negative CSF sample. The following viruses were included in the panel: adenovirus serotype 2 (AdV-2), BK and JC polyomaviruses (BKPyV and JCPyV, respectively), HSV-1, HSV-2, CMV, EBV, VZV, HHV-6a and b, and Parvovirus B19 (PV B19). Further information as to their characteristics can be found on the NIBSC website 1 and in studies by Doris et al. (2015).

Clinical CSF Samples
A total of 12 CSF samples from patients suffering from acute viral encephalitis, previously characterized by routine diagnostic testing, were analyzed.

Digitonin-DNAse Treatment of CSF Samples
Plasma membranes of cells present in 200 μl CSF were permeabilized by adding digitonin (Sigma Aldrich, Poole, UK) to a final concentration of 25-100 μg/ml and incubating at 37°C for 5 min, followed by the addition of 2 U of Turbo DNAse enzyme and Turbo DNAse buffer (both ThermoFisher, Dartford, UK) to a final concentration of 1X. Digests were incubated at 37°C for 10 min, followed by immediate extraction.
1 www.nibsc.org/documents/ifu/15-130-xxx.pdf
Frontiers in Microbiology | www.frontiersin.org

Nucleic Acid Extraction and Library Preparation
A total of 200 μl of either untreated or digitonin-DNAse-treated CSF was extracted using PureLink Viral RNA/DNA Mini Kit (Invitrogen, Renfrewshire, UK) following the manufacturer's specification but omitting carrier RNA. Concentrations of dsDNA in extracts were determined using the Quant-it dsDNA HS Assay Kit on a Qubit 3.0 fluorometer (both Invitrogen).
Sample extracts were diluted to 0.2 ng/μl where possible; extracts with lower DNA concentrations were used without dilution. DNA libraries were prepared from 5 μl DNA using the Nextera XT DNA library prep kit (Illumina, Cambridge, UK) according to the manufacturer's instructions.
The standard protocol for the clean-up of libraries used AMPure XP beads (Beckman Coulter, High Wycombe, UK) at the recommended 1.8X bead ratio. The effects of single and double clean-up steps and the use of 0.85X bead concentration were investigated. Following clean-up, libraries were analyzed for size distribution using the High Sensitivity DNA Kit on a 2100 Bioanalyzer Instrument (both Agilent, Stockport, UK) and were quantified using Qubit, as described above.
Batches of four libraries labeled with different indexes were pooled; within each pool, each component library contributed the same total mass. Pools were further quantified by Qubit, as described above and diluted to a final concentration of 2 nM before being denatured with 0.2 N sodium hydroxide for 2 min, diluted in kit reagent HT1 to produce a 20 pM solution and then further diluted to 7.9 pM. Of this library pool dilution, 600 μl was loaded onto a MiSeq cartridge. Sequencing was performed on an MiSeq instrument using the MiSeq Reagent Kit V2 (300 cycles; both Illumina) according to the manufacturer's guidelines.
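The pooling and dilution steps above follow the usual C1 x V1 = C2 x V2 relation. The Python sketch below is illustrative only: the function names, the per-library mass, and the example library concentrations are assumptions made here, and only the 20 pM and 7.9 pM concentrations and the 600 μl load volume are taken from the protocol.

```python
def stock_volume_ul(stock_conc, target_conc, final_volume_ul):
    """Volume of stock needed to make final_volume_ul at target_conc.

    Uses C1 * V1 = C2 * V2; both concentrations must share one unit.
    """
    if target_conc > stock_conc:
        raise ValueError("cannot dilute to a higher concentration")
    return target_conc * final_volume_ul / stock_conc


def equal_mass_volumes_ul(concs_ng_per_ul, mass_ng):
    """Volume of each library that contributes the same mass to a pool."""
    return [mass_ng / c for c in concs_ng_per_ul]


# Final dilution step: 20 pM denatured pool down to 7.9 pM in 600 ul.
v_pool = stock_volume_ul(20.0, 7.9, 600.0)
v_ht1 = 600.0 - v_pool  # remainder made up with diluent (HT1)

# Hypothetical example: four libraries pooled at 2 ng each.
pool_volumes = equal_mass_volumes_ul([0.5, 1.0, 2.0, 4.0], mass_ng=2.0)

print(round(v_pool, 1), round(v_ht1, 1), pool_volumes)
```

Under these assumptions, roughly 237 μl of the 20 pM solution would be made up to 600 μl to reach 7.9 pM; actual volumes should follow the manufacturer's guidelines.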

Real-Time PCR
The relative abundance of human material in sample extracts was evaluated by real-time PCR using primer and probe sets targeting c-myc (Schroeder and Nitsche, 2010) and β-globin (Lo et al., 1998). Reactions were performed using the KAPA Probe Fast Universal Kit (Roche, Burgess Hill, UK) according to the manufacturer's instructions.

Data Analysis
Adapters and poor-quality terminal bases were removed from paired-end FASTQ files with Trimmomatic v0.39 (Bolger et al., 2014; RRID:SCR_011848), followed by removal of duplicates and low-complexity reads using PRINSEQ (Schmieder and Edwards, 2011;RRID: SCR_005454) with an entropy cut-off of 70 and all de-duplication options selected. Cleaned FASTQs were mapped with PALADIN (Westbrook et al., 2017) to a database comprising the RefSeq viral protein sequences downloaded from the /refseq/release/viral directory within the NCBI ftp repository (located at ftp://ftp.ncbi.nlm.nih.gov) supplemented with the NCBI RefSeq human protein sequences (/refseq/H_sapiens/mRNA_Prot). NCBI taxonomy files (/pub/taxonomy/taxdmp.zip) were used to map taxon IDs to each reference (Brister et al., 2015;RRID:SCR_003496). Hits were considered only for those mapping results having an e-score below 10⁻¹⁰. For each hit, taxon IDs for the mapped target itself and all of its parental taxonomic divisions were obtained by iteratively querying the nodes.dmp file of taxdmp.zip; counters were incremented for all taxon IDs common to both ends of paired-end reads. Outputs were limited to viruses within taxonomic divisions known to infect humans (Woolhouse et al., 2012;Woolhouse and Adair, 2013).
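The taxonomy step described above (walking from each mapped target up through its parental divisions in nodes.dmp, then counting only the taxon IDs shared by both ends of a read pair) can be sketched as follows. This is a minimal illustration rather than the code used in the study: the function names are hypothetical, the taxon IDs in the toy example are made up, and the sketch assumes the standard nodes.dmp layout in which the first two tab-and-pipe-delimited fields are the taxon ID and its parent.

```python
from collections import Counter


def load_parents(nodes_lines):
    """Parse nodes.dmp-style lines into {tax_id: parent_tax_id}."""
    parents = {}
    for line in nodes_lines:
        fields = line.rstrip("\t|\n").split("\t|\t")
        parents[int(fields[0])] = int(fields[1])
    return parents


def lineage(tax_id, parents):
    """Return tax_id and all ancestors, stopping at the root (its own parent)."""
    seen = []
    while tax_id not in seen:
        seen.append(tax_id)
        tax_id = parents.get(tax_id, tax_id)
    return seen


def count_pair(tax_r1, tax_r2, parents, counts):
    """Increment counters for taxon IDs common to both read ends."""
    shared = set(lineage(tax_r1, parents)) & set(lineage(tax_r2, parents))
    counts.update(shared)


# Toy taxonomy with made-up IDs: 1 = root, 10 = family, 20 = genus,
# 30 and 31 = two species within that genus.
toy_nodes = [
    "1\t|\t1\t|\tno rank\t|\n",
    "10\t|\t1\t|\tfamily\t|\n",
    "20\t|\t10\t|\tgenus\t|\n",
    "30\t|\t20\t|\tspecies\t|\n",
    "31\t|\t20\t|\tspecies\t|\n",
]
parents = load_parents(toy_nodes)
counts = Counter()
count_pair(30, 31, parents, counts)  # ends hit different species
count_pair(30, 30, parents, counts)  # both ends hit the same species
print(dict(counts))
```

In this toy case, a pair whose ends hit two different species still contributes to the shared genus and family counters, which is how discordant pairs can be resolved to a higher taxonomic level. The root is also counted in this sketch and would normally be ignored when reporting.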
In the second analysis stage, cleaned FASTQs were mapped with BWA MEM (Li, 2013;RRID:SCR_010910) to RefSeq genomes of those viruses having over 1.5 reads per million assigned by PALADIN. To visualize detection of diverse genome fragments within a target virus, mapped reads were binned by the percentile within the genome length of their mapped starting position (the POS field in the SAM files).
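The two quantities used in this second stage, reads per million and the genome-length percentile of each read's mapped start position, can be illustrated with a short Python sketch. The function names and example numbers are hypothetical; the sketch assumes 1-based POS values as defined by the SAM format and bins them into percentiles 0-99.

```python
RPM_THRESHOLD = 1.5  # viruses above this PALADIN rpm go to reference mapping


def reads_per_million(hit_reads, total_reads):
    """Normalize a hit count to reads per million (rpm)."""
    return hit_reads * 1_000_000 / total_reads


def genome_percentile(pos, genome_length):
    """Map a 1-based SAM POS to a percentile bin (0-99) of the genome."""
    return min(99, (pos - 1) * 100 // genome_length)


def occupied_percentiles(positions, genome_length):
    """Sorted list of distinct percentile bins containing a read start."""
    return sorted({genome_percentile(p, genome_length) for p in positions})


# Hypothetical example: 13 hits among 4 million reads, 1,000 nt genome.
rpm = reads_per_million(13, 4_000_000)
bins = occupied_percentiles([1, 5, 250, 999], genome_length=1000)
print(rpm, rpm > RPM_THRESHOLD, bins)
```

Counting distinct occupied bins gives a simple measure of how broadly reads are spread across a genome, as opposed to piling up in one or two short regions.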

Digitonin-DNAse Treatment Depletes Human DNA From CSF Extracts
The effect of digitonin-DNAse treatment on human DNA concentrations in nucleic acid extracts was investigated by treating a virus-negative CSF sample with 25, 50, 75, or 100 μg/ml digitonin followed by DNAse digestion. Real-time PCR against c-myc and β-globin showed greater than 99% reduction in human material in both cases, with Qubit fluorometry showing an approximate 90% reduction (Table 1) at a concentration greater than 50 μg/ml. Similar analysis of the CVM panel resuspended in a virus-negative CSF sample treated with 50 μg/ml digitonin and DNAse showed a reduction of 95-99% in human material compared to controls. This protocol enhancement was applied to four CSF samples from patients diagnosed with viral encephalitis. In three of the four, a reduction in human material of up to 98% was obtained. In the fourth sample, no reduction was observed, although the initial concentration was very low.
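A percentage reduction in human material can be derived from real-time PCR data by comparing cycle threshold (Ct) values of treated and untreated extracts. The sketch below shows one common way of doing this and is not necessarily the calculation used here: it assumes a known amplification efficiency (1.0 meaning perfect doubling per cycle), and the function name and Ct values are hypothetical.

```python
def percent_reduction_from_ct(ct_untreated, ct_treated, efficiency=1.0):
    """Estimate % reduction in template from a Ct shift.

    Assumes each cycle multiplies the product by (1 + efficiency), so a
    delay of delta_ct cycles means (1 + efficiency) ** -delta_ct of the
    original template remains.
    """
    delta_ct = ct_treated - ct_untreated
    remaining = (1.0 + efficiency) ** (-delta_ct)
    return (1.0 - remaining) * 100.0


# Hypothetical Ct values: a 7-cycle delay after treatment.
reduction = percent_reduction_from_ct(ct_untreated=25.0, ct_treated=32.0)
print(round(reduction, 2))
```

Under the perfect-efficiency assumption, a delay of about 6.6 cycles or more corresponds to a reduction of at least 99%.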

Digitonin-DNAse in Combination With AMPure Size Selection Enhances Virus Detectability in CSF Extracts
Libraries prepared from duplicate extracts of both control and digitonin-DNAse treated CVM panel aliquots were sequenced on a single MiSeq run, yielding an average of 2.6 million paired-end reads per sample. With the exception of HSV-2, which showed a slight reduction in relative read count compared to the control, all viruses showed an increase (Table 2). BKPyV was exceptional in showing an average 60x increase. VZV, JCPyV, and adenovirus (AdV) showed increases of 30x, 28x, and 15x, respectively. The remaining panel viruses showed 1.6-2.6x increases. Strikingly, PV B19 was undetectable in the control libraries, but three reads per million (rpm) were detected in the two libraries following digitonin-DNAse treatment.
The four libraries from the previous analysis were subjected to a second clean-up step at an AMPure bead ratio of 0.85X. This step selectively removes shorter fragments from the libraries (Figure 1). The rightmost columns of Table 2 show a modest effect of up to 1.3x on the detection of viral reads in the control samples and a 1.3-2.5x effect upon the digitonin-DNAse treatment libraries. With both enhancements combined, BKPyV showed the highest increase in read frequency (131x). VZV, JCPyV, and AdV showed 61x, 59x, and 32x increases, respectively, with the remaining herpesviruses showing a combined increase of 1.3-6.2x.

Application of the Enhanced Protocol to CSF Samples From Encephalitides
A series of 12 CSF samples from acute encephalitis patients was tested together with negative human plasma (NHP) and water controls, using the enhanced protocol incorporating the digitonin-DNAse and size selection modifications. With conventional diagnostics, a DNA virus etiology was established in nine of the patients and excluded in one (Table 3). In samples from the remaining two patients (nos. 1 and 11), the presence of AdV and BKPyV (respectively) was provisional; the real-time PCR curves emerged at a cycle beyond the established limit of detection of the assay. Four samples were multiplexed on each MiSeq sequencing run, giving approximately 3-5 million reads per sample.
In seven of the diagnoses (five of VZV and two of JCPyV), the metagenomic analysis gave a strong corroborating signal with 16-161 rpm (Table 3, nos. 2, 3, 6-9, and 12). A sixth VZV diagnosis (no. 4) gave only EBV hits by PALADIN at 16 rpm, mapping to a broad range of genomic regions, suggesting possible initial mis-diagnosis (Figure 2). Mapping this sample's FASTQ files to the VZV reference genome gave over 200 hits, but these were almost entirely targeting two short regions of VZV; only three regions in total had any mapping at all. Similarly, EBV was the only virus detected in a sample initially diagnosed with HHV-6 infection (no. 5 in Table 3), with 8.6 rpm detected by PALADIN. Sample no. 11 (one of the two samples with a late diagnostic PCR, for BKPyV in this instance) gave 8.7 rpm for EBV. Both samples' read sets again mapped to diverse regions of the EBV genome, whereas in the former, only 11 reads mapped to the HHV-6b genome across nine percentiles. The latter sample also had a relatively high number (22 rpm) of torque teno virus (TTV) reads detected, also mapping across genome percentiles, although no higher resolution identification than the genus Alphatorquevirus was possible. The second late-cycle diagnosis, for AdV (sample no. 1), gave a very high number of reads for human papillomavirus type 10 (HPV10), sufficient to assemble an entire genome (data not shown).
The final sample (no. 10) was positive for the β-glucan biomarker, suggesting the presence of a fungal pathogen (Lyons et al., 2015), and no viral targets were detected in this sample. Secondary AdV detections were made by PALADIN analysis in the two JCPyV-positive samples (nos. 7 and 8). However, reference mapping of these targets indicated that these were false positives, with zero AdV-mapping reads in either sample. An additional PALADIN detection of EBV in sample 7 corresponded to a total of just eight EBV-mapping reads.
In the two controls, the water control gave few reads, of which none derived from a DNA virus, whereas HHV-6b was detected in the NHP control at a rate of 3.3 rpm, with 14 mapped reads across eight genome percentiles.

DISCUSSION
Effective treatment of many forms of encephalitis relies upon a prompt response, with delays often leading to devastating consequences. The high number of cases in which the etiological agent remains undiagnosed highlights the need for improved diagnostic methods. The use of unbiased, sensitive and cost-effective metagenomic NGS assays to sequence the total RNA and DNA in a sample represents a potential breakthrough in the diagnosis of infectious encephalitis.
In this study, we present an mNGS protocol that allows enhanced detection and characterization of DNA viruses in CSF samples, overcoming the challenges of low target abundance through the use of digitonin-DNAse treatment and AMPure bead-based size-selection of library fragments. Up to 99% of the human DNA was removed by this method, more than by methods exploiting differential methylation between host and viral genomic material (Feehery et al., 2013;Oyola et al., 2013;Thoendel et al., 2016). Concomitantly, virus read frequencies were enhanced by up to nearly 60-fold. These values compare favorably with conventional methods of viral enrichment, where the depletion of human material either led to only modest increases in viral read frequency (Hall et al., 2014), or gave improvements that depended either upon the virus and techniques used (Kohl et al., 2015) or upon high nucleic acid input quantities (Conceição-Neto et al., 2015;Parras-Moltó et al., 2018). This mirrors our experience with these physicochemical methods in that the recovery of some viruses can be enhanced, but this is invariably at the expense of others. Although other saponins have been successfully applied to pathogen metagenomics (Hasan et al., 2016), the variability of commercial saponin products reduces their potential for use in clinical applications.
A second enhancement followed the observation that in DNA libraries prepared from digitonin-DNAse treated samples, fragments of human origin had a size distribution considerably shorter than viral fragments (data not shown). We hypothesize that much of this material represents DNAse-hypersensitive mononucleosomes (Schwartz et al., 2019). Size-selection of libraries derived from both virus panels and clinical samples with a low ratio of AMPure beads resulted in a further increase in the frequencies of viral reads, as well as a greater representation of reads from diverse genomic regions of the detected viruses. The PV B19 genomes in the CVM panel gave low rpm values, and then only after digitonin treatment, perhaps reflecting a low copy number in the CVM panel and/or the single-stranded nature of its genome. AMPure bead-based size selection is routinely used to selectively remove adapter dimers from library preps, but other than a recent VIDISCA paper (Edridge et al., 2019), this is the first time it has been used to perform the enrichment of viral DNA fragments on library preps from CSF samples.
The enhanced mNGS workflow was challenged using a panel of CSF samples from patients suffering from encephalitis, previously diagnosed by routine diagnostic tests. Sequencing data were blindly analyzed, and achieved concordant results in seven of the nine samples with a definitive diagnosis. In the two discordant samples, mNGS clearly detected a different viral species within the Herpesviridae family from that originally diagnosed, both through PALADIN and through reference mapping. Unfortunately, retrospective confirmatory laboratory tests could not be performed owing to a lack of remaining sample, and the cause of the discrepancies remains unclear. Reads from both samples mapped to their originally diagnosed viruses (HHV-6 and VZV), but in both cases, the number and distribution of hits were both much lower than for those detected by PALADIN.
In one of two CSF samples with an unconfirmed diagnosis by routine testing, the presence of EBV and TTV was identified by mNGS; TTV has been recently detected in CSF samples from encephalitis patients, and EBV is a well-established cause (Kang et al., 2017;Eibach et al., 2019). In the second, mNGS was able to assemble a complete genome of HPV10, an alphapapillomavirus exclusively associated with cutaneous lesions (Cubie, 2013), and hence most likely to represent a skin flora contaminant arising from lumbar puncture. A final sample had a putative diagnosis of a fungal agent, and our mNGS assay detected no viral reads.
The impact of hits caused by read mis-assignment or reads from reagents and environmental contaminants was initially minimized by filtering the mapping results both by e-score and by limiting outputs to viruses within taxonomic divisions known to infect humans (Woolhouse et al., 2012;Woolhouse and Adair, 2013). Notwithstanding these filters, in two JCPyV-positive samples, a low number of AdV reads were detected. In one of the two, EBV was also detected. Secondary reference mapping analysis revealed the AdV detections to be false positives and the EBV attribution to be doubtful, owing to the low number of both hits and mapped reads, in contrast with the high values from the true positive JCPyV outputs. These data support the use of multiple bioinformatic tools with diverse algorithmic natures, a principle that has been repeatedly shown to improve the accuracy of metagenomic analyses (Lin and Liao, 2017;McIntyre et al., 2017). In a recent paper (Miller et al., 2019), a group from San Francisco proposed having at least three viral reads spanning at least three non-overlapping regions of the most closely matched reference sequence as a requirement to report a pathogen detected by metagenomics. In the light of the VZV data from sample 4, and the hits at up to 5 rpm in both samples and the NHP control, we would advocate more stringent corroborating metrics within our metagenomic assays; however, setting thresholds at a level that discriminates between true and false positive detections while retaining useful sensitivity remains largely empirical.
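The reporting criterion of Miller et al. (2019) mentioned above (at least three reads spanning at least three non-overlapping regions of the reference) can be expressed as a simple check on mapped read intervals. The Python sketch below is one possible reading of that rule, not the published implementation: it treats each read as an inclusive (start, end) interval on the reference and uses greedy interval scheduling to find the largest set of mutually non-overlapping reads. The function names and coordinates are hypothetical.

```python
def max_non_overlapping(intervals):
    """Largest number of mutually non-overlapping (start, end) intervals.

    Greedy interval scheduling: sort by end coordinate and keep every
    interval that starts after the last kept interval ends.
    """
    count = 0
    last_end = None
    for start, end in sorted(intervals, key=lambda iv: iv[1]):
        if last_end is None or start > last_end:
            count += 1
            last_end = end
    return count


def passes_region_rule(intervals, min_regions=3):
    """True if reads cover at least min_regions non-overlapping regions."""
    return max_non_overlapping(intervals) >= min_regions


# Hypothetical read placements on a reference genome.
clustered = [(1, 150), (100, 250), (120, 270)]           # one pile-up
dispersed = [(1, 150), (500, 650), (9000, 9150), (9100, 9250)]
print(passes_region_rule(clustered), passes_region_rule(dispersed))
```

A pile-up of many reads in one or two short regions would fail such a check regardless of read count, which is why a stricter minimum number of regions is a natural way to tighten the rule.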
Hence, in all cases, formal diagnosis necessitates confirmation with pathogen-specific assays (Granerod et al., 2010b;Brown et al., 2018), although the diagnosis of HHV-6 in sample 5 could be dependent upon the testing algorithm. It should be noted, however, that demonstrating that a detected agent is causative can be problematic, particularly in cases where a novel agent is discovered (Tan et al., 2013;Phan et al., 2015, 2016;Bukowska-Ośko et al., 2018;Eibach et al., 2019).
The level of agreement between our mNGS results and routine diagnostics compares favorably to those of other authors. For example, a similar Swiss study recently reported metagenomic analyses of six CSF samples with a DNA virus diagnosis, of which only one was reliably concordant with prior diagnostics. In the remainder, the signal-to-noise ratios were insufficient to consider the metagenomic information valid (Oechslin et al., 2018). In another recent study using the VIDISCA-NGS technique, CSF samples were tested in which the presence of herpesvirus had been previously diagnosed by routine qPCR test. Digestion of target material during DNase treatment presented a problem and as a result, virus was detected in just one of the DNAse treated samples. Less than 30% of the non-DNAse treated samples gave a signal, and only then in high viral load samples (Edridge et al., 2019).
In contrast, the San Francisco group reported a strong concordance between routine and mNGS results. The study evaluated the accuracy of an mNGS assay for detection of pathogens causing encephalitis, including 26 DNA virus positive and 19 DNA virus negative samples previously tested by qPCR assay, observing an accuracy of 89.8%. This value increased to 92.4% when repeat testing of discrepant samples was performed (Miller et al., 2019).
To conclude, digitonin-DNAse treatment can effectively improve the ratio of viral to host DNA in CSF samples. The proportion of viral reads can be further improved by size-selecting libraries prepared from digitonin-DNAse treated samples. The use of effective enrichment methods allows more samples to be multiplexed per sequencing run, thus reducing costs and making the mNGS approach more economical in the clinical setting. By applying only moderately advanced bioinformatic tools, the presence of DNA viruses can be successfully identified in the resulting mNGS datasets. Thus, in conjunction with a parallel RNA virus method (e.g., adapted from Manso et al., 2017), this proposed mNGS assay has the potential to help detect viral causative agents from the high number of encephalitis cases with unknown etiology and to be used as a second-line test to current target-specific assays. The increased accessibility of NGS technologies in clinical microbiology laboratories and the ever-decreasing costs of running these tests should make this a reality. Furthermore, mNGS will spur improvements in the current screening tests by identifying new and emerging etiological agents which could be later incorporated into the target-specific first-line tests.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

AUTHOR CONTRIBUTIONS
CM carried out research in the lab. DFB carried out the bioinformatic data analysis. HM contributed to lab research. MZ performed the primary diagnostics on the samples and assembled the CSF clinical sample set. CM, JM, and DWGB conceived of the project. CM, DFB, and JM conceived of lab methods. JM supervised the project. DFB and CM wrote the manuscript. All authors contributed to the article and approved the submitted version.

FUNDING
This work has been funded by the Clinical Virology Network (CVN) research grant program. CVN had no role in study design, data collection and interpretation, or the decision to submit the work for publication.