Cerebrospinal Fluid in a Small Cohort of Patients with Multiple Sclerosis Was Generally Free of Microbial DNA

Multiple sclerosis (MS) is a common cause of non-traumatic neurologic disability with high incidence in many developed countries. Although the etiology of the disease remains elusive, it is thought to entail genetic and environmental causes, and microbial pathogens have also been envisioned as contributors to the phenotype. We conducted a metagenomic survey in cerebrospinal fluid (CSF) from 28 MS patients and 15 patients suffering from other types of neurological conditions. We detected bacterial reads in eight out of the 15 non-MS patients and in a single MS patient, at an abundance >1% of total classified reads. Two patients were of special interest: one non-MS patient harbored ~73% bacterial reads, while an MS patient had ~83% bacterial reads. In the former case, Veillonella parvula, a bacterium occasionally found associated with meningitis, was the predominant species, whilst Kocuria flava, apparently an environmental bacterium, predominated in the latter case. Thirty-four out of 43 samples contained <1% bacterial reads, which we regard as cross- or environmental contamination. A few viral reads corresponding to Epstein-Barr virus, cytomegalovirus, and parvovirus were also identified. Our results suggest that CSF of MS patients is often (but not always) free of microbial DNA.


INTRODUCTION
Multiple sclerosis (MS) is a common inflammatory-demyelinating disease of the central nervous system (CNS), which leads to impaired sensory, motor, cerebellar, brain stem, and autonomic functions, and is the foremost cause of non-traumatic neurologic disability in young and middle-aged patients (Giesser, 2011). At the molecular level, demyelination involves deterioration, primarily by macrophages, of the myelin sheaths that surround axonal connections between neurons (Comi, 2010). Because T cells mediate myelin degradation, MS is often referred to as an autoimmune disease (Piccio and Cross, 2011). Despite intense research, the cause of MS remains elusive, but the evidence accumulated so far points toward a multifactorial etiology, including genetic (Oksenberg, 2013; Cree, 2014; Sawcer et al., 2014; Axisa and Hafler, 2016) and environmental (O'Gorman et al., 2012; Ascherio, 2013; Malli et al., 2015) factors. In Caucasian cohorts, MHC haplotypes have been found to be strongly associated with MS (McFarland and Martin, 2007). Some environmental predisposing factors have also been identified, including vitamin D deficiency and smoking; in addition, women are twice as susceptible as men, possibly due to a neuroprotective role of testosterone (Gold and Voskuhl, 2009). Although the disease is distributed worldwide, it holds a positive correlation with latitude, and foci of high incidence are rather patchy across the globe, suggesting that some ethnic groups are more susceptible than others (Ramagopalan and Sadovnick, 2011; Pantazou et al., 2015) or that locally concentrated infectious microbes may be associated with disease biogenesis (Comi, 2010; Libbey et al., 2014). Furthermore, in a large cohort of patients (n > 40,000) from Canada, Great Britain, Denmark, and Sweden, it was found that significantly fewer people suffering from MS were born in November, while significantly more MS patients were born in May. This might point to seasonal oscillations in vitamin D or infectious agents, which in turn may interact with genetic factors, since the effect was heightened in familial cases (Willer et al., 2005).
Cerebrospinal fluid (CSF) is a clear, non-viscous material with a very low content of proteins, potassium, and glucose when compared to plasma, but a higher content of chloride and sodium, which maintains the neutral charge of CSF (Davson and Segal, 1996). It is believed to originate predominantly as a secretion of the choroid plexuses in the ventricles of the brain, upon filtering of plasma through choroid epithelial cells (Davson and Segal, 1996), although this hypothesis has been challenged and it has instead been proposed that CSF also originates in other compartments of the CSF system (Orešković and Klarica, 2016). CSF localizes to the brain ventricles and subarachnoid space and also surrounds the spinal cord (Davson and Segal, 1996). In addition to acting as a mechanical cushion and providing buoyancy for the cortex, CSF provides immunological protection to the brain (Nathanson and Chun, 1989) and chemical stability (Praetorius, 2007), among other functions.
The presence of immunoglobulins (IgG) in CSF in a large proportion of patients with neurological disorders suggests that at least some of those diseases might have an infectious etiology (reviewed in Das Sarma, 2010). For instance, IgG with specificity against measles virus was found in patients suffering from subacute sclerosing panencephalitis (Connolly et al., 1967; Connolly, 1968); similarly, antibodies against Cryptococcus have been found in CSF of meningitis patients (Porter et al., 1977). A wide range of microorganisms has been proposed as putative causes of MS (Johnson, 1994; Swanborg et al., 2003; Stuve et al., 2004; O'Gorman et al., 2012; Zawada, 2012; Libbey et al., 2014). Many studies have also reported an association between herpesviruses and MS (Warren et al., 1977; Merelli et al., 1997; Moore and Wolfson, 2002; Kuusisto et al., 2008), and the gut microbiome of MS patients exhibits a different profile from that of paired control subjects (Chen et al., 2016; Jangi et al., 2016). It is not clear how antigen-presenting cells in the CNS can interact with activated lymphocytes in the periphery, but it has been proposed that the choroid plexus may display antigens in CSF, likely presented by astrocytes or glial cells, to peripheral blood cells through the choroid epithelium (Nathanson and Chun, 1989).
Since MS is a demyelinating disease, logic would suggest that potential pathogens influencing the disease should also be able to induce demyelination. Several species of bacteria have been reported to invade the CNS (Casserly et al., 2007), including Mycoplasma pneumoniae (Abramovitz et al., 1987) and Chlamydia pneumoniae (Sriram et al., 1999). M. pneumoniae reportedly induces demyelination (Greenlee and Rose, 2000), but this finding has not been replicated by other investigators (Casserly et al., 2007; Lindsey and Patel, 2008). Another hypothesis emerging from animal models suggests that bacterial superantigens (e.g., Staphylococcus aureus enterotoxins A, B, C, D, and E) activate auto-reactive T cells, which then promote the onset of immune diseases like MS (Brocke et al., 1993). Indeed, a significant proportion of patients with relapsing MS (within 30 days) were positive for S. aureus enterotoxin A as compared to subjects without MS (Mulvey et al., 2011). For viruses, several mechanisms of demyelination have been proposed, including viral lysis of infected oligodendrocytes or immune lysis of uninfected oligodendrocytes specifically or non-specifically triggered by the viral infection (reviewed in Libbey et al., 2014). While microbes are attractive candidates for triggering MS, the environmental triggers for disease have yet to be determined, reflecting the limited sensitivity of techniques applied for microbial detection as well as concomitant problems of sample contamination. Accordingly, we sought to analyze the composition of DNA circulating in CSF from patients affected by MS or other neurological conditions, using an unbiased metagenomics approach to identify possible microbial triggers of MS. Our main findings suggest that CSF is often free of microbial DNA, but occasionally can be loaded with bacterial DNA. Virus-like sequences were initially detected, but upon careful examination most sequences turned out to be spurious hits belonging to contaminating DNA. A limited fraction of the reads may correspond to bona fide herpesviruses and parvoviruses.

RESULTS
Most Multiple Sclerosis (MS) Samples Are Free of Bacterial DNA
We initially attempted to extract DNA from 78 CSF samples from patients affected by MS or other neurological conditions. However, only 30 of those samples produced detectable levels of DNA, and we used them for the construction of metagenomics libraries (samples are described in Table 1). We also used non-measurable amounts of DNA from non-MS patients, and could obtain libraries suitable for sequencing from 15 samples, but this required a higher number of PCR amplification cycles. Hereafter, samples derived from multiple sclerosis patients are denoted by the letters "MS" while the rest are designated as "non-MS" or "control" samples. In a previous study to assess the sensitivity of metagenomics for detection of microbial DNA in CSF, it was shown that libraries from cell cultures spiked with only viral nucleic acids may yield up to ∼4% of bacterial sequences, suggesting environmental contamination (Bukowska-Ośko et al., 2017). Using this frequency as a somewhat arbitrary threshold, we found that 4 out of 15 (∼27%) non-MS samples contained more than 5% of bacterial reads (Figure 1, Figure S1, and Table 1), while only a single MS sample out of 28 (∼3.6%) was found to harbor more than 5% of bacterial reads. Interestingly, all those samples correspond to female patients. The remaining libraries also had some bacterial reads, but at very low abundance, with 34 out of 43 libraries containing <1% bacterial reads (Tables 1, 2). Two patients are of special interest (3C and 26MS). Patients 3C and 26MS harbored ∼73% and ∼83% bacterial reads, respectively. The bacterial taxonomic profiles of the two samples were markedly different, excluding the possibility of cross-contamination between them (Figure 1). The sample from patient 3C was mainly populated with reads that resemble Veillonella parvula and, to a lesser extent, Streptococcus gordonii and S. sanguinis, plus a few other species of lower abundance (Figure 1A). The sample from patient 26MS was overwhelmingly populated by three species of Kocuria (flava, palustris, and rhizophila) and a series of non-abundant bacteria (Figure 1B). To exclude the possibility that an abundant and highly redundant sequence was producing spurious alignments to one or a few loci in the genomes of those two overrepresented bacteria, we plotted the relative abundance of reads along the whole genome of V. parvula (Figure 2A) and Kocuria flava (Figure 2B). The relatively homogeneous distribution of reads along the genomes of those two bacteria suggests that those bacterial genomes were present in the sequenced pool of DNA, and excludes the possibility of artifactual alignments.
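The coverage check described above can be illustrated with a minimal sketch. It is not the procedure used to draw Figure 2; it only shows the underlying idea, assuming that 0-based alignment start positions are already available for each read. The function names, the number of bins, the genome length, and the read positions are all hypothetical and chosen for illustration.

```python
def coverage_profile(read_starts, genome_length, n_bins=100):
    """Fraction of reads falling into each of n_bins equal-width genome bins."""
    if genome_length <= 0 or n_bins <= 0:
        raise ValueError("genome_length and n_bins must be positive")
    counts = [0] * n_bins
    for pos in read_starts:
        if 0 <= pos < genome_length:  # ignore positions outside the genome
            counts[pos * n_bins // genome_length] += 1
    total = sum(counts)
    return [c / total if total else 0.0 for c in counts]


def evenness(profile):
    """Return (fraction of bins with at least one read, largest share in one bin)."""
    covered = sum(1 for p in profile if p > 0) / len(profile)
    return covered, max(profile)


# Hypothetical data: reads spread along a 2 Mb genome versus a pile-up at one locus.
GENOME_LENGTH = 2_000_000  # illustrative value, not a measured genome size
spread_reads = list(range(0, GENOME_LENGTH, 1000))
piled_reads = [500_000 + i for i in range(2000)]

print(evenness(coverage_profile(spread_reads, GENOME_LENGTH)))
print(evenness(coverage_profile(piled_reads, GENOME_LENGTH)))
```

A genome that is truly present in the library should cover most bins with no single bin dominating, whereas a repetitive or adapter-like sequence producing spurious alignments would concentrate nearly all reads in one or a few bins.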
To gain insights into the relationships between samples, we conducted principal component analysis (PCA; Jovel et al., 2016). PCA was conducted on any taxon that was detected in at least one sample (n = 559; Supplementary Table 1). The first eigenvector of the PCA separated samples that contained more than 1% of bacterial reads from those that contained <1% (Figure 3A). This positioned samples 3C, 4C, 5C, 7C, 8C, 10C, 12C, and 14C to the left of the rest of the control samples and somewhat proximal to each other (Figure 3A). The second eigenvector of the PCA clearly separated sample 26MS from the rest of the samples containing bacterial reads. Samples that contained <1% bacterial reads clustered together in the northeast quadrant of the plot, irrespective of being MS or non-MS (Figure 3A). To refine our analysis, we conducted hierarchical clustering. In hierarchical clustering, an iterative algorithm positions samples with similar bacterial profiles closer on the branches of a dendrogram, while more dissimilar samples branch apart (Jovel et al., 2016). The most ubiquitous bacterial species were Propionibacterium acnes and Alteromonas mediterranea, with the former bacterium previously found as a contaminant in our libraries (Jovel et al., 2016). As deduced from the PCA, all samples in the control group with more than 1% of bacterial reads clustered together in a single node (green node in upper dendrogram) with some minor subgroups (Figure 3B). Within this cluster, the sample most densely populated with bacterial reads (3C; red node on upper dendrogram; leftmost column in heatmap) might have acted as a contamination source, especially for samples that were sitting nearby in the same rack during library construction, since samples were ordered numerically on the rack. The fact that samples 4C, 5C, 7C, and 8C were also found to contain bacterial reads supports the cross-contamination hypothesis. Accordingly, the profile of relative abundance of bacterial reads depicted in the heatmap is similar, but not identical, between control samples that contained more than 1% of bacterial reads (Figure 3B). Noticeably, there are two conspicuous blocks of bacterial taxa that appear to be absent from most samples, but are commonly represented in the control samples with more than 1% of bacterial reads. When such bacterial sets were zoomed in, a large number of bacterial species became evident, with the genera Streptococcus, Pseudomonas, and Bacteroides being among the most frequently detected (blue font in Figure 3C). The most abundant species in non-MS samples, V. parvula, was represented in five out of the eight control samples, but at high frequency only in sample 3C. As before, the MS sample harboring a large proportion of bacterial reads clearly separated from the non-MS samples (blue node on upper dendrogram in Figure 3B). K. flava, the most abundant bacterial species in sample 26MS (73,796 reads), was found in some other samples (2MS, 18MS, 2C, 13C, 10MS, and 7C) but at a maximal abundance of 35 reads per library. Based on our previous experience, we regard that as cross-contamination. In addition to K. flava, sample 26MS contained a series of bacterial taxa of various abundances, which were only rarely seen in other samples (Figure 3B, cyan frame; Supplementary Table 1, colored red). In general, samples that are located close to each other on the wet-lab rack can be considerably contaminated, while trace cross-contamination may happen between all tubes, likely when DNA is aerosolized.
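The clustering idea described above can be sketched in a few lines. The text does not state which distance or linkage was used, so the Bray-Curtis dissimilarity and average linkage below are assumptions made purely for illustration, and the sample names and abundance vectors are invented toy values rather than data from this study.

```python
from itertools import combinations


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors (0 = identical)."""
    num = sum(abs(x - y) for x, y in zip(a, b))
    den = sum(a) + sum(b)
    return num / den if den else 0.0


def average_linkage(profiles):
    """Agglomerative clustering; returns merges as (members_a, members_b, distance)."""
    clusters = [(name,) for name in sorted(profiles)]
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            # Average pairwise dissimilarity between the members of both clusters.
            d = sum(bray_curtis(profiles[i], profiles[j]) for i in a for j in b)
            d /= len(a) * len(b)
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
        merges.append((a, b, d))
    return merges


# Invented relative abundances over three taxa (toy values only).
toy_profiles = {
    "low_1": [0.98, 0.01, 0.01],
    "low_2": [0.97, 0.02, 0.01],
    "high_1": [0.20, 0.70, 0.10],
    "high_2": [0.25, 0.65, 0.10],
    "outlier": [0.05, 0.05, 0.90],
}

for left, right, dist in average_linkage(toy_profiles):
    print(left, right, round(dist, 3))
```

In this toy example the two low-abundance profiles merge first, the two similar high-abundance profiles merge next, and the compositionally distinct profile joins last, which mirrors how a sample with a unique bacterial profile ends up on its own branch of a dendrogram.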

Spurious Viral Hits
In addition to phage sequences, the most prevalent viruses detected were Human immunodeficiency virus 1 (HIV-1) and herpesviruses 5 and 7. Although they were present at very low abundance (1-16 reads per library), the high frequency of HIV-1 sequences was rather suspicious and prompted us to inspect such hits more carefully. We extracted the raw reads that aligned to all viruses, and removed those corresponding to phage hits. The rest were aligned with BLAST against the whole "nt" database of NCBI. As expected, the HIV-1 hits were spurious, since better hits either to lentiviral vectors or to a draft genome of the bacterium Ralstonia solanacearum were identified (Supplementary Table 2). On closer inspection of the Ralstonia draft genome, we observed that it contains many sequences resembling Illumina adapters, which create artifactual alignments (Figure S2). A few sequences that were originally classified as cytomegalovirus (Human herpesvirus 5), Epstein-Barr virus (Human herpesvirus 4), parvoviruses, or Paramecium bursaria Chlorella virus were also confirmed as such during re-alignments (Supplementary Table 2, colored red). Thus, it is possible that such viruses are bona fide colonizers of CSF. In addition, sequences originally classified as Encephalomyocarditis virus were reclassified as Herpes simplex virus (Human herpesvirus 1). Some viral sequences that are likely associated with food, including Beet curly top virus and Iridoviruses, were also confirmed during re-alignments.
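The re-alignment check described above was done by visual inspection; the following is only a minimal sketch of how the same comparison could be automated. It assumes BLAST tabular output with the default 12 columns (`-outfmt 6`), where the second column is the subject identifier and the twelfth is the bit score. The read identifiers, subject identifiers, and the subject-to-taxon lookup are hypothetical placeholders, not real accessions.

```python
import csv
import io


def top_hits(blast_tsv):
    """Return {read_id: (subject_id, bitscore)} keeping the best-scoring hit per read."""
    best = {}
    for row in csv.reader(io.StringIO(blast_tsv), delimiter="\t"):
        if len(row) < 12 or row[0].startswith("#"):
            continue  # skip blank, comment, or malformed lines
        read_id, subject_id, bitscore = row[0], row[1], float(row[11])
        if read_id not in best or bitscore > best[read_id][1]:
            best[read_id] = (subject_id, bitscore)
    return best


def discordant(kraken_labels, best, subject_taxon):
    """List (read, Kraken label, BLAST label) where the two classifications differ."""
    out = []
    for read_id, label in kraken_labels.items():
        hit = best.get(read_id)
        blast_label = subject_taxon.get(hit[0], "unknown") if hit else "no hit"
        if blast_label != label:
            out.append((read_id, label, blast_label))
    return out


# Hypothetical example: read1 has a better hit to a vector than to HIV-1.
example_blast = (
    "read1\tsubj_hiv\t95.0\t100\t5\t0\t1\t100\t1\t100\t1e-40\t150\n"
    "read1\tsubj_vector\t100.0\t100\t0\t0\t1\t100\t1\t100\t1e-50\t185\n"
    "read2\tsubj_ebv\t99.0\t100\t1\t0\t1\t100\t1\t100\t1e-48\t180\n"
)
example_kraken = {"read1": "HIV-1", "read2": "Epstein-Barr virus"}
example_taxa = {
    "subj_hiv": "HIV-1",
    "subj_vector": "lentiviral vector",
    "subj_ebv": "Epstein-Barr virus",
}

print(discordant(example_kraken, top_hits(example_blast), example_taxa))
```

Reads flagged this way would still need manual inspection of the alignments, as done in this study, because a higher bit score against a draft genome may itself be an artifact (for example, adapter-like sequence in the subject).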

Bacterial and Human Cells Were Scarce in CSF
Reads mapping to the human genome spanned all chromosomes (Figures 4A,B). However, bacterial and human sequences detected in CSF need not be from intact circulating cells; they can also be from cell-free DNA. To distinguish between these two possibilities, we stained CSF with Alexa Fluor-594 phalloidin and DAPI. Phalloidin binds filamentous actin and therefore stains human cells but not bacteria, while DAPI binds the DNA of both human and bacterial cells, which can ultimately be distinguished by size. We first prepared a control sample that was spiked with Escherichia coli and human THP-1 monocytes, and both human (red) and bacterial (blue) cells were readily detected after staining (Figure 4C). However, when we stained CSF samples, entities resembling stained bacteria or human cells were rarely found under fluorescence microscopy [Figure 4D (human and bacterial cells) and Figure 4E (human cells)]. Indeed, we screened all CSF samples described in Table 1 and, with the exception of sample 26MS, we were unable to detect cell- or bacteria-like structures.

DISCUSSION
It has been a long-standing debate whether or not pathogens are involved in the etiology of multiple sclerosis (MS; Swanborg et al., 2003; Ramagopalan and Sadovnick, 2011; O'Gorman et al., 2012). Several lines of evidence suggest an infectious etiology. First, immunoglobulins with specificity against viruses or bacteria have been found in CSF from MS patients (Belbasis et al., 2015). Second, several bacteria, fungi, and viruses have been linked with demyelination (Johnson, 1994; Greenlee and Rose, 2000; Purzycki and Shain, 2010). Third, the differential geographical distribution of the disease (Giovannoni et al., 2006), the high incidence in developed countries (consistent with the hygiene hypothesis; Gilden, 2005), and changing incidence in subjects born in different seasons of the year (Willer et al., 2005), all might point toward pathogens as causal agents of MS.
A number of studies have reported the presence of pathogens in cerebrospinal fluid (CSF) from MS patients (Merelli et al., 1997; Moore and Wolfson, 2002; Swanborg et al., 2003; Stuve et al., 2004; Holmoy and Vartdal, 2005; Purzycki and Shain, 2010; Morandi et al., 2015), but the reproducibility of these experiments was often lacking (Casserly et al., 2007; Lindsey and Patel, 2008; Belbasis et al., 2015). Interestingly, a meta-analysis only found evidence of association of MS with smoking, IgG seropositivity to Epstein-Barr virus (EBV) nuclear antigen (EBNA), and infectious mononucleosis (Belbasis et al., 2015). Several experimental limitations might account for such a discrepancy. First, it may be that antibodies or PCR primers used for detection of pathogens lacked specificity and picked up background signal. Second, environmental contaminants are always a liability when conducting diagnostics in clinical samples. We applied metagenomics for the unbiased detection of microbial DNA in CSF samples from patients diagnosed with MS or other neurological conditions used as a comparison group.
Despite the limited size of the cohort of patients analyzed in this study (28 MS and 15 non-MS patients), our results suggest that CSF is often (but not always) free of bacteria and viruses, but always contains human DNA. However, we also need to entertain the hypothesis that some of the patients from our comparison group contained a higher frequency of microbial reads for some unknown reasons. The low background of bacterial reads observed across all libraries (<1%) likely represents cross-contamination events or environmental contamination that could have happened during sample collection, nucleic acid extraction, library preparation, or carry-over in the sequencing instrument (i.e., from previous sequencing runs with libraries that contained the same barcodes used in this study). Contamination of libraries is a major concern when conducting next-generation sequencing experiments (Jovel et al., 2016). Samples with more than 1% of bacterial reads likely reflect bona fide samples colonized with bacteria and/or bacterial DNA, with the highly abundant ones being less likely to represent contamination events.
Only one out of 28 MS samples contained a significant fraction of bacterial reads (83.25%), while eight out of 15 non-MS samples contained more than 1% of bacterial reads. On average, 19 and 24 PCR cycles were required for amplification of MS and non-MS libraries, respectively. The larger number of re-amplification cycles for non-MS samples means that such samples were manipulated more intensively than MS ones (i.e., each amplification PCR is followed by a clean-up step with paramagnetic beads to remove low molecular weight material), which implies that non-abundant environmental contaminants may have been amplified and that the chances of cross-contamination increased. In the non-MS sample with a high load of bacterial DNA (3C), the most abundant species was V. parvula, a non-fermentative gram-negative coccus that is usually found colonizing the oral cavity, GI tract, and vagina, but occasionally has been reported associated with meningitis (Bhatti and Frank, 2000) and other opportunistic infectious diseases in immunocompromised individuals (Strach et al., 2006). The most abundant species in sample 26MS was K. flava, an actinobacterium previously described as an airborne or soil bacterium (Zhou et al., 2008; Achala et al., 2011). Thus, the association of such a bacterium with MS is unclear. In summary, both Veillonella parvula in sample 3C and Kocuria spp. in sample 26MS are likely genuine bacteria colonizing the CSF of these patients. Literature reports on a putative role of bacteria in MS are rather discordant and hint at M. pneumoniae, C. pneumoniae, and S. aureus as possible causal agents (Abramovitz et al., 1987; Brocke et al., 1993; Sriram et al., 1999; Libbey et al., 2014); this is incongruent with our main results.
A few viral sequences belonging to Epstein-Barr virus, cytomegalovirus, and parvoviruses may be truly associated with CSF and may be involved in the etiology of MS, as previously suggested (Warren et al., 1977; Merelli et al., 1997; Moore and Wolfson, 2002; Swanborg et al., 2003; Stuve et al., 2004; Franciotta and Lolli, 2005; Gilden, 2005; Giovannoni et al., 2006). That only a few virus-like sequences were detected in our libraries is not unexpected; it is dictated by the overwhelmingly higher relative abundance of human and bacterial reads.

MATERIALS AND METHODS
Study protocols and the consent form were approved by the Health Research Ethics Board, University of Alberta (HREBA). All patients read and signed the consent form before sample collection.

DNA Extraction, Construction of Libraries, and Sequencing
DNA from 48 multiple sclerosis (MS) patients and 30 patients with other neurological conditions was extracted from 200 µl of cerebrospinal fluid (CSF) using the Qiagen blood DNA mini kit, according to the manufacturer's instructions. Only 28 MS samples produced DNA in amounts measurable by the Qubit dsDNA HS assay kit (Invitrogen), all of them in the sub-nanogram per µl range. Only two non-MS samples produced DNA in measurable quantities (∼0.3 ng/µl).
For construction of metagenomics libraries, the Nextera XT (Illumina) technology was used. In brief, 2 ng of DNA (or whatever was available in 5 µl of DNA suspension) was tagmented with 5 µl of ATM for 5 min at 55 °C. The tagmentation reaction was stopped with 5 µl of NT buffer and further incubation for 5 min at room temperature. Subsequently, the tagmentation products were PCR amplified in the presence of a distinct barcode for each sample, for 19 cycles in the case of MS samples, while non-MS samples needed on average 24 amplification cycles to obtain libraries in quantities sufficient for sequencing. After removal of primer dimers and low molecular weight materials, libraries were diluted to a concentration of 4 nM and pooled for sequencing. Sequencing was conducted on a MiSeq (Illumina) instrument, using a 600-cycle V3 kit and a workflow that includes adapter removal and demultiplexing.
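The dilution to 4 nM mentioned above relies on converting a mass concentration into a molar one. A minimal sketch of that arithmetic follows, using the commonly applied approximation of 660 g/mol per base pair of double-stranded DNA. The library concentration, mean fragment length, and final volume in the example are hypothetical values chosen for illustration, not measurements from this study.

```python
def library_molarity_nm(conc_ng_per_ul, mean_fragment_bp, g_per_mol_per_bp=660.0):
    """Convert a dsDNA library concentration (ng/µl) to nanomolar."""
    if conc_ng_per_ul < 0 or mean_fragment_bp <= 0:
        raise ValueError("concentration must be >= 0 and fragment length > 0")
    return conc_ng_per_ul / (g_per_mol_per_bp * mean_fragment_bp) * 1e6


def dilution_volumes(stock_nm, target_nm, final_ul):
    """Return (µl of library, µl of diluent) to reach target_nm in final_ul (C1V1 = C2V2)."""
    if stock_nm < target_nm:
        raise ValueError("library is already below the target concentration")
    stock_ul = target_nm * final_ul / stock_nm
    return stock_ul, final_ul - stock_ul


# Hypothetical library: 2 ng/µl with a mean fragment length of 500 bp.
molarity = library_molarity_nm(2.0, 500)
library_ul, diluent_ul = dilution_volumes(molarity, target_nm=4.0, final_ul=10.0)
print(round(molarity, 2), round(library_ul, 2), round(diluent_ul, 2))
```

The mean fragment length would normally come from an electrophoretic trace of the library; because it appears in the denominator, an error in that estimate translates directly into an error in the loaded molarity.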

Bioinformatics Analyses
Demultiplexed reads were aligned against a customized database using the ultrafast metagenomic classification package Kraken (Wood and Salzberg, 2014), and abundance estimates were generated with the package Bracken (Lu et al., 2016), which performs a Bayesian re-estimation of abundance after classification with Kraken. Plots were generated either with Krona (Ondov et al., 2011; taxonomy plots) or using in-house R scripts. Reads that were reported by Kraken as virus hits were extracted from the raw data and then aligned with BLAST against a recent version of the nt database of NCBI, and top hits were retrieved for taxonomy comparison against the Kraken classification by visual inspection of the corresponding alignments.
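As an illustration of how a per-sample bacterial percentage can be derived from a classifier report, the sketch below parses a Kraken-style tab-delimited report (columns: percentage, reads in clade, reads assigned directly, rank code, NCBI taxonomy ID, indented name) and expresses bacterial reads as a percentage of classified reads. It is not the in-house workflow used here; the function names are hypothetical, the report content is invented, and the 1% cut-off simply mirrors the threshold discussed in the text.

```python
def clade_reads(report_text):
    """Map taxon name -> number of reads in its clade from a Kraken-style report."""
    counts = {}
    for line in report_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        counts[fields[5].strip()] = int(fields[1])
    return counts


def bacterial_percent(report_text):
    """Bacterial reads as a percentage of all classified reads (the 'root' clade)."""
    counts = clade_reads(report_text)
    classified = counts.get("root", 0)
    return 100.0 * counts.get("Bacteria", 0) / classified if classified else 0.0


def flag(percent, threshold=1.0):
    """Label a library relative to the background threshold (assumed 1%)."""
    return "above threshold" if percent > threshold else "background"


# Invented report for illustration only.
example_report = "\n".join([
    " 10.00\t1000\t1000\tU\t0\tunclassified",
    " 90.00\t9000\t10\t-\t1\troot",
    " 85.50\t8550\t0\tD\t2759\t  Eukaryota",
    "  4.50\t450\t5\tD\t2\t  Bacteria",
])

pct = bacterial_percent(example_report)
print(pct, flag(pct))
```

Looking taxa up by name keeps the sketch short; a more careful implementation would key on the taxonomy ID column, since names are not guaranteed to be unique across a full report.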

FIGURE 1 | Taxonomic classification of samples 3C (A) and 26MS (B) performed by Kraken and plotted with Krona.

FIGURE 2 | Frequency histogram of relative abundance of reads aligning to different parts of the genome of Veillonella parvula (A) or Kocuria flava (B).

FIGURE 3 | Illustration of compositional relationship of samples as assessed by principal component analysis (A) and hierarchical clustering (B). In (C), a close-up of some bacterial taxa frequently detected is shown.

FIGURE 4 | Distribution of human reads along all chromosomes of the genome in non-MS (A) or MS (B) samples. (C) Staining of control bacterial (Escherichia coli) and human (THP-1 monocytes) cells with phalloidin and DAPI. Human cells are shown in red and bacterial cells are shown in blue. (D) A few bacterial cell-like structures and one human cell-like structure in sample 26MS. (E) A single cluster of putative human cells in sample 26MS from a different area of the slide. None of the other samples had any visible structure resembling human or bacterial cells. White and cyan arrowheads point to human and bacterial cells in (C) and to putative human and bacterial cells in (D).

TABLE 1 | Description of samples and results of Kraken alignments with regard to the most abundant groups detected (human, bacteria, viruses, and archaea).

TABLE 2 | Mean, standard deviation, and median abundance of human, bacterial, viral, and archaea DNA per group.