<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Genet.</journal-id>
<journal-title>Frontiers in Genetics</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Genet.</abbrev-journal-title>
<issn pub-type="epub">1664-8021</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fgene.2015.00016</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Genetics</subject>
<subj-group>
<subject>Methods Article</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Mining for viral fragments in methylation enriched sequencing data</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Mensaert</surname> <given-names>Klaas</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<uri xlink:href="http://community.frontiersin.org/people/u/136895"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Van Criekinge</surname> <given-names>Wim</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Thas</surname> <given-names>Olivier</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Schuuring</surname> <given-names>Ed</given-names></name>
<xref ref-type="aff" rid="aff3"><sup>3</sup></xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Steenbergen</surname> <given-names>Renske D.M.</given-names></name>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Wisman</surname> <given-names>G. Bea A.</given-names></name>
<xref ref-type="aff" rid="aff4"><sup>4</sup></xref>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>De Meyer</surname> <given-names>Tim</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="author-notes" rid="fn001"><sup>&#x0002A;</sup></xref>
<uri xlink:href="http://community.frontiersin.org/people/u/136896"/>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>Department of Mathematical Modeling, Statistics and Bioinformatics, Ghent University</institution> <country>Ghent, Belgium</country></aff>
<aff id="aff2"><sup>2</sup><institution>Department of Pathology, VU University Medical Center</institution> <country>Amsterdam, Netherlands</country></aff>
<aff id="aff3"><sup>3</sup><institution>Department of Pathology, Cancer Research Center Groningen, University of Groningen, University Medical Center Groningen</institution> <country>Groningen, Netherlands</country></aff>
<aff id="aff4"><sup>4</sup><institution>Department of Gynecologic Oncology, Cancer Research Center Groningen, University of Groningen, University Medical Center Groningen</institution> <country>Groningen, Netherlands</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Mark D. Robinson, University of Zurich, Switzerland</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Ping Ma, University of Georgia, USA; Mattia Pelizzola, Istituto Italiano di Tecnologia, Italy</p></fn>
<fn fn-type="corresp" id="fn001"><p>&#x0002A;Correspondence: Tim De Meyer, BioBix, Department of Mathematical Modeling, Statistics and Bioinformatics, Ghent University, Coupure Links 653, Ghent 9000, Belgium e-mail: <email>tim.demeyer&#x00040;ugent.be</email></p></fn>
<fn fn-type="other" id="fn002"><p>This article was submitted to Bioinformatics and Computational Biology, a section of the journal Frontiers in Genetics.</p></fn>
</author-notes>
<pub-date pub-type="epub">
<day>04</day>
<month>02</month>
<year>2015</year>
</pub-date>
<pub-date pub-type="collection">
<year>2015</year>
</pub-date>
<volume>6</volume>
<elocation-id>16</elocation-id>
<history>
<date date-type="received">
<day>30</day>
<month>06</month>
<year>2014</year>
</date>
<date date-type="accepted">
<day>13</day>
<month>01</month>
<year>2015</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2015 Mensaert, Van Criekinge, Thas, Schuuring, Steenbergen, Wisman and De Meyer.</copyright-statement>
<copyright-year>2015</copyright-year>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p>
</license>
</permissions>
<abstract><p>Most next generation sequencing experiments generate more data than is usable for the experimental set up. For example, methyl-CpG binding domain (MBD) affinity purification based sequencing is often used for DNA-methylation profiling, but up to 30% of the sequenced fragments cannot be mapped uniquely to the reference genome. Here we present and evaluate a methodology for the identification of viruses in these otherwise unused paired-end MBD-seq data. Viral detection is accomplished by mapping non-reference alignable reads to a comprehensive set of viral genomes. As viruses play an important role in epigenetics and cancer development, 92 (pre)malignant and benign samples, originating from two different collections of cervical samples and related cell lines, were used in this study. These samples include primary carcinomas (<italic>n</italic> &#x0003D; 22), low- and high-grade cervical intraepithelial neoplasia (CIN1 and CIN2/3 - <italic>n</italic> &#x0003D; 2/<italic>n</italic> &#x0003D; 30) and normal tissue (<italic>n</italic> &#x0003D; 20), as well as control samples (<italic>n</italic> &#x0003D; 17). Viruses that were detected include phages, adenoviruses, herpesviridae and HPV. HPV, which causes virtually all cervical cancers, was identified in 95% of the carcinomas, 100% of the CIN2/3 samples, both CIN1 samples and in 55% of the normal samples. Comparing the amount of mapped fragments on HPV for each HPV-infected sample yielded a significant difference between normal samples and carcinomas or CIN2/3 samples (adjusted <italic>p</italic>-values resp. &#x0003C;10<sup>&#x02212;5</sup>, &#x0003C;10<sup>&#x02212;5</sup>), reflecting different viral loads and/or methylation degrees in non-normal samples. Fragments originating from different HPV types could be distinguished and were independently validated by PCR-based assays in 71% of the detections. 
In conclusion, although limited by the a priori knowledge of viral reference genome sequences, the proposed methodology can provide a first confined but substantial insight into the presence, concentration and types of methylated viral sequences in MBD-seq data at low additional cost.</p></abstract>
<kwd-group>
<kwd>viruses</kwd>
<kwd>epigenomics</kwd>
<kwd>DNA-methylation</kwd>
<kwd>next generation sequencing</kwd>
<kwd>bioinformatics</kwd>
<kwd>cervical cancer</kwd>
<kwd>human papillomavirus</kwd>
<kwd>MBD-seq</kwd>
</kwd-group>
<counts>
<fig-count count="4"/>
<table-count count="5"/>
<equation-count count="0"/>
<ref-count count="59"/>
<page-count count="9"/>
<word-count count="7196"/>
</counts>
</article-meta>
</front>
<body>
<sec sec-type="introduction" id="s1">
<title>1. Introduction</title>
<p>The advent of next generation sequencing (NGS) has initiated a revolution in molecular biology. Due to massively parallel sequencing, new insights could be revealed in genetics, transcriptomics and more recently epigenomics. However, the processing of the sheer amount of data produced by these methods proved to be a challenge. Identification of nucleotide sequences is often the first step in many NGS analyses, yet a substantial fraction cannot be properly identified. These unidentified fragments might arise from low-complexity regions (e.g., repeats), bacteria, viruses, other organisms or artificial noise (e.g., adaptor dimers, Head et al., <xref ref-type="bibr" rid="B518">2014</xref>).</p>
<p>Previous studies have identified viruses by screening reads of RNA-seq from human samples. With this approach, the occurrence of EBV and CMV could be demonstrated in colorectal cancer and a landscape of viruses could be identified in a range of cancers (Khoury et al., <xref ref-type="bibr" rid="B27">2013</xref>; Salyakina and Tsinoremas, <xref ref-type="bibr" rid="B41">2013</xref>). In this project, we interrogated fragments from methyl-CpG binding domain enrichment based sequencing (MBD-seq) for a putative viral origin, thereby evaluating whether a similar approach could also be successful for DNA methylation studies.</p>
<p>MBD-seq is a methodology for the detection of CpG-methylation, an epigenetic modification that is essential for cellular differentiation and for processes such as genomic imprinting, X-chromosome inactivation and silencing of transposable elements (Jones, <xref ref-type="bibr" rid="B26">2012</xref>). This method is based on the enrichment of CpG methylated fragments using methyl binding domains, followed by massively parallel sequencing. By mapping these fragments to a reference genome, the putatively methylated locus can be determined. Though affected by several biases, the number of fragments mapped to a locus can be considered a proxy for the methylation degree of that locus. MBD-seq has been shown to be sufficiently sensitive, specific and cost effective for genome-wide studies (Serre et al., <xref ref-type="bibr" rid="B44">2010</xref>; Aberg et al., <xref ref-type="bibr" rid="B1">2012</xref>).</p>
<p>Viruses play an important role in public health. Aside from causing infectious disease, some are known to be clear risk factors for the development of cancer. Currently known oncoviruses include human papillomavirus (HPV), Epstein-Barr virus (EBV), Kaposi&#x00027;s sarcoma associated herpesvirus (KSHV), human cytomegalovirus (CMV) and Merkel cell polyomavirus (MCP). It is estimated that viruses have a causal role in about 16% of all human cancers (Schiller and Lowy, <xref ref-type="bibr" rid="B42">2010</xref>; de Martel et al., <xref ref-type="bibr" rid="B9">2012</xref>). Therefore, prevention of and vaccination against these viral infections could prevent the occurrence of the cancers they cause. Viral DNA detection has previously been achieved by a range of other methods (Bexfield and Kellam, <xref ref-type="bibr" rid="B5">2011</xref>). State-of-the-art methods are predominantly sequencing based, for example combined with enrichment techniques or ultra deep sequencing (Allander et al., <xref ref-type="bibr" rid="B2">2001</xref>; John et al., <xref ref-type="bibr" rid="B24">2011</xref>; Lysholm et al., <xref ref-type="bibr" rid="B34">2012</xref>). Enrichment based methods, however, depend on viral particles, which prevents them from detecting integrated viruses. Deep sequencing, on the other hand, gives an unbiased representation but severely reduces efficiency (Willner and Hugenholtz, <xref ref-type="bibr" rid="B56">2013</xref>). With the advent of sequencing based viral research, the need for specific bioinformatics tools also became urgent (Fancello et al., <xref ref-type="bibr" rid="B512">2012</xref>).</p>
<p>CpG methylation is known to play various roles in the life cycle of viruses and their oncogenicity (Hoelzer et al., <xref ref-type="bibr" rid="B20">2008</xref>; Poreba et al., <xref ref-type="bibr" rid="B39">2011</xref>). For example, papillomaviruses are generally hypomethylated when being actively replicated, but are heavily methylated while inserted into the host genome (Hoelzer et al., <xref ref-type="bibr" rid="B20">2008</xref>). HPV might be mediating the methylation of its own genome, as HPV16&#x00027;s viral protein E7 is found to bind and stimulate the activity of DNA methyltransferase 1 (Dnmt1) (Burgers et al., <xref ref-type="bibr" rid="B6">2007</xref>). Also, viral DNA hypermethylation of HPV is more prominent in carcinomas than in asymptomatic infections or dysplasia (Fernandez et al., <xref ref-type="bibr" rid="B515">2009</xref>; Marongiu et al., <xref ref-type="bibr" rid="B35">2014</xref>). In EBV, hypermethylation helps to hide its presence by inhibiting expression of viral latency proteins that could be recognized by cytotoxic T-cells (Paulson and Speck, <xref ref-type="bibr" rid="B38">1999</xref>). Even the latency stage and the tumor type are associated with different methylation patterns of the EBV genome (zur Hausen, <xref ref-type="bibr" rid="B59">2006</xref>). Adenoviruses have also been proven to be <italic>de novo</italic> methylated by insertion, but never in a free DNA stage (Doerfler, <xref ref-type="bibr" rid="B511">2009</xref>). As several tumor-promoting and potentially methylable viruses remain to be identified, we aim at identifying viruses in the typically ignored non-reference aligned sequence reads of MBD-seq experiments.</p>
<p>Here, we demonstrate the usability and relevance of this approach on a collection of cervical samples, including cervical cancer and cervical intraepithelial neoplasia (CIN), which are putative cervical cancer precursors (Steenbergen et al., <xref ref-type="bibr" rid="B48">2014</xref>). Cervical cancer is the third most common cancer among women worldwide, and the estimated prevalence of HPV in cervical cancer is above 99%, strongly supporting the causal role of HPV in cancer development (Walboomers et al., <xref ref-type="bibr" rid="B53">1999</xref>; Ferlay et al., <xref ref-type="bibr" rid="B514">2010</xref>). Cervical tissue is known to be frequently infected by HPV (Clifford et al., <xref ref-type="bibr" rid="B8">2005</xref>) and HPV is often methylated (Hoelzer et al., <xref ref-type="bibr" rid="B20">2008</xref>). Therefore, cervical samples make an ideal test set for the detection of methylated viruses, HPV in particular.</p>
</sec>
<sec sec-type="materials and methods" id="s2">
<title>2. Materials and methods</title>
<sec>
<title>2.1. Samples and MBD-seq</title>
<p>Of the 92 samples, 40 originated from the VU University Medical Center (VUmc) in Amsterdam, further referred to as Set 1. Of this set, 10 samples were obtained from carcinoma, 12 from high-grade cervical intraepithelial neoplasia (CIN2/3), 3 from low-grade cervical intraepithelial neoplasia (CIN1) and 15 from cell cultures (see Table <xref ref-type="table" rid="T1">1</xref>, which lists the samples available for analysis). The cell culture samples included 2 isolates of primary human foreskin keratinocytes (labeled EK), 10 DNA isolates of keratinocytes transfected with full length HPV16 and HPV18 DNA and the plasmid pcDNAIneo (Invitrogen) (different passages of cell lines FK16A, FK16B, FK18A, FK18B; Steenbergen et al., <xref ref-type="bibr" rid="B46">1996</xref>), 2 DNA isolates of keratinocytes transduced with HPV16E6E7 cloned in the retroviral vector LZRS-MS-IERS-NEO/pBr (Kim et al., <xref ref-type="bibr" rid="B28">2006</xref>; Steenbergen et al., <xref ref-type="bibr" rid="B47">2013</xref>) and the cervical cancer cell line SiHa. In addition, 52 samples were collected from patients visiting the Department of Gynecologic Oncology of the University Medical Center Groningen (UMCG), further referred to as Set 2. Of these samples, 12 are from carcinomas, 18 from high-grade cervical intraepithelial neoplasia (CIN2/3), 2 from leukocytes and 20 from normal cervical tissue. Each of the two leukocyte samples was pooled from 2 persons. This study has been approved by the ethical committees of UMCG and VUmc, adhering to the Declaration of Helsinki.</p>
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption><p><bold>Overview of the histological sample groups and their origin</bold>.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th/>
<th valign="top" align="left"><bold>Cell culture</bold></th>
<th valign="top" align="center"><bold>Carcinoma</bold></th>
<th valign="top" align="center"><bold>CIN2/3</bold></th>
<th valign="top" align="center"><bold>CIN1</bold></th>
<th valign="top" align="center"><bold>Normal</bold></th>
<th valign="top" align="center"><bold>Leukocyte</bold></th>
<th valign="top" align="center"><bold>Total</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Set 1</td>
<td valign="top" align="center">15</td>
<td valign="top" align="center">10</td>
<td valign="top" align="center">12</td>
<td valign="top" align="center">2</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">39</td>
</tr>
<tr>
<td valign="top" align="left">Set 2</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">12</td>
<td valign="top" align="center">18</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">20</td>
<td valign="top" align="center">2</td>
<td valign="top" align="center">52</td>
</tr>
<tr>
<td valign="top" align="left">Total</td>
<td valign="top" align="center">15</td>
<td valign="top" align="center">22</td>
<td valign="top" align="center">30</td>
<td valign="top" align="center">2</td>
<td valign="top" align="center">20</td>
<td valign="top" align="center">2</td>
<td valign="top" align="center">91</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>To obtain the DNA methylation profiles, the MethylCap kit from Diagenode was combined with Illumina Genome Analyzer IIx paired-end sequencing as described in De Meyer et al. (<xref ref-type="bibr" rid="B510">2013</xref>), except that 500 ng of input DNA was used instead of 200 ng. Due to data corruption in a compressed format, data for one CIN1 sample (completely) and one normal sample (partially) were unavailable for further processing. Therefore, only 2 CIN1 datasets were available, resulting in a total of 91 samples for analysis. Bowtie 1.0.0 was subsequently used to map the obtained paired-end reads (51 bp) from fastq-files to the human reference genome of NCBI v37 (Langmead et al., <xref ref-type="bibr" rid="B31">2009</xref>). The maximum insert size was set to 400 bp and up to 3 mismatches were allowed in the seed sequence to avoid overly stringent mapping. For other parameters, the default settings were used. DNA molecules for which the paired-end reads could not be mapped to the reference genome will be further referred to as &#x0201C;non-canonical&#x0201D; fragments, whereas &#x0201C;canonical&#x0201D; fragments will be used to refer to fragments that could be aligned to the reference genome. The non-canonical fragments can be obtained from our website (<ext-link ext-link-type="uri" xlink:href="http://www.biobix.be/viralmbd/">http://www.biobix.be/viralmbd/</ext-link>).</p>
</sec>
<sec>
<title>2.2. Virus detection</title>
<p>We aimed to identify fragments of viral origin. This was achieved by searching for sequence similarity between the non-canonical reads and a set of viral reference genomes. For this purpose, we used FR-HIT (Niu et al., <xref ref-type="bibr" rid="B37">2011</xref>). All viral genomes from NCBI and EMBL-EBI were used for the construction of a set of viral reference genomes (<ext-link ext-link-type="uri" xlink:href="http://www.ebi.ac.uk/genomes/virus.html">http://www.ebi.ac.uk/genomes/virus.html</ext-link> &#x00026; <ext-link ext-link-type="uri" xlink:href="http://www.ncbi.nlm.nih.gov/genomes/GenomesGroup.cgi">http://www.ncbi.nlm.nih.gov/genomes/GenomesGroup.cgi</ext-link>) (Wheeler et al., <xref ref-type="bibr" rid="B54">2006</xref>; Leinonen et al., <xref ref-type="bibr" rid="B32">2011</xref>). For reference genomes with a sequence similarity of over 95% (cut-off), the shortest genomes were removed with CD-HIT-EST (Fu et al., <xref ref-type="bibr" rid="B516">2012</xref>). This prevents a bias against fragments for which several very similar reference genomes exist: paired reads whose best hits fall on different but very similar reference genomes are not retained and would otherwise yield false negatives. In order to diminish false positive identifications, several precautions were taken. First, reads characterized by low complexity (dust-score &#x0003E;4) were filtered out with prinseq-lite (Schmieder and Edwards, <xref ref-type="bibr" rid="B43">2011</xref>). Second, FR-HIT was forced to utilize the complete reads by using the &#x0201C;global mapping&#x0201D; strategy and only the best hits with an <italic>e</italic>-value smaller than 10<sup>&#x02212;5</sup> were used. Finally, viral identification was affirmed only if the best hits of both reads of a pair originated from the same virus. Duplicated fragments, which have the same start position for their first read and the same end position for their second read, were removed.
By default, FR-HIT masks low complexity regions in the reference sequences; since such filtering was already performed on the reads, this function was disabled. The end result of this approach is a dataset of virus (<italic>v</italic>) specific counts (<italic>N</italic><sub><italic>vs</italic></sub>) for each sample (<italic>s</italic>). Whenever we observed an <italic>N</italic><sub><italic>vs</italic></sub> &#x0003E; 0, we reported the virus to be present for that sample. Scripts for the execution of the pipeline can be found here: <ext-link ext-link-type="uri" xlink:href="https://github.com/klamens/viral-pipeline"><monospace>https://github.com/klamens/viral-pipeline</monospace></ext-link>.</p>
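<p>The pair-level filtering rules described above (e-value cut-off, same-virus requirement for both reads, duplicate removal) can be illustrated with a minimal Python sketch. It is not taken from the published pipeline scripts: the function name <monospace>count_viral_fragments</monospace>, the tuple layout of the input and the example values are hypothetical, and parsing of the actual FR-HIT output is deliberately left out.</p>
<preformat>
```python
from collections import Counter

E_VALUE_CUTOFF = 1e-5  # threshold quoted in the text


def count_viral_fragments(pairs, cutoff=E_VALUE_CUTOFF):
    """Return virus-specific fragment counts (N_vs) for one sample.

    Each element of `pairs` holds the best hit of both reads of a fragment:
    (virus_read1, start_read1, evalue_read1, virus_read2, end_read2, evalue_read2).
    This tuple layout is hypothetical; it is not the FR-HIT output format.
    """
    seen = set()
    counts = Counter()
    for virus1, start1, evalue1, virus2, end2, evalue2 in pairs:
        # Both best hits need an e-value smaller than the cutoff.
        if max(evalue1, evalue2) >= cutoff:
            continue
        # Both best hits must originate from the same virus.
        if virus1 != virus2:
            continue
        # Duplicates share the start of read 1 and the end of read 2.
        key = (virus1, start1, end2)
        if key in seen:
            continue
        seen.add(key)
        counts[virus1] += 1
    return counts


# Made-up example fragments, for illustration only.
example = [
    ("HPV16", 100, 1e-12, "HPV16", 350, 1e-10),   # kept
    ("HPV16", 100, 1e-11, "HPV16", 350, 1e-9),    # duplicate, dropped
    ("HPV16", 900, 1e-12, "HPV18", 1150, 1e-10),  # discordant pair, dropped
    ("EBV", 40, 1e-3, "EBV", 300, 1e-12),         # weak hit, dropped
    ("EBV", 500, 1e-8, "EBV", 760, 1e-7),         # kept
]
n_vs = count_viral_fragments(example)
# A virus is reported as present when its count is above zero.
present = sorted(virus for virus, n in n_vs.items() if n > 0)
```
</preformat>
<p>In this sketch a virus appears in <monospace>present</monospace> as soon as a single concordant, non-duplicated fragment remains, mirroring the <italic>N</italic><sub><italic>vs</italic></sub> &#x0003E; 0 rule stated above.</p>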
</sec>
<sec>
<title>2.3. Statistical tests</title>
<p>Association between histological origin (carcinoma, CIN2/3, CIN1, normal) and the presence of HPV in a sample was tested with Pearson&#x00027;s Chi-squared test with 2000 simulated permutations. Association of high-risk HPV type occurrence and histological groups was tested as well. The most abundant HPV type per sample was used for the assessment of high/low risk HPV type occurrence. When the abundances of the most and second most abundant type were equal and their risk was different, the sample was excluded from testing. For a comparison of the fraction of viral fragments, <italic>N</italic><sub><italic>vs</italic></sub>-values were normalized relative to the total number of sequenced fragments of the sample. These normalized fractions are denoted as <italic>R</italic><sub><italic>vs</italic></sub>. The fractions of counts mapped to specific viruses were compared between the different histological groups with the Kruskal-Wallis Test. These groups included samples from carcinoma, CIN2/3, CIN1 and normal tissue and, for HERV-K113 only, also cell cultures and leukocytes. <italic>Post-hoc</italic> analyses were performed with the Mann-Whitney-Wilcoxon Test and <italic>p</italic>-values were adjusted for multiple testing by Bonferroni correction (Hochberg, <xref ref-type="bibr" rid="B519">1988</xref>). For both the Kruskal-Wallis Test and the Mann-Whitney-Wilcoxon Test, a location shift assumption was made, resulting in testing for a difference between the medians of <italic>R</italic><sub><italic>vs</italic></sub>. Statistical analyses and graphical plot creations were performed within the statistical environment R (Wickham, <xref ref-type="bibr" rid="B55">2009</xref>; R Core Team, <xref ref-type="bibr" rid="B40">2012</xref>).</p>
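<p>As a small worked illustration of two steps named above, the following Python sketch computes a normalized fraction <italic>R</italic><sub><italic>vs</italic></sub> and applies a plain Bonferroni adjustment to a set of <italic>post-hoc</italic> <italic>p</italic>-values. The analyses in this study were run in R; the function names and all numbers below are hypothetical and serve only to make the arithmetic explicit.</p>
<preformat>
```python
def normalized_fraction(n_vs, total_fragments):
    """R_vs: viral fragment count divided by all sequenced fragments of a sample."""
    if total_fragments == 0:
        raise ValueError("sample has no sequenced fragments")
    return n_vs / total_fragments


def bonferroni(p_values):
    """Multiply every p-value by the number of tests and cap the result at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Made-up numbers, for illustration only.
r_vs = normalized_fraction(250, 5_000_000)        # 250 viral fragments out of 5 million
adjusted = bonferroni([0.0004, 0.02, 0.3, 0.5])   # four hypothetical pairwise tests
```
</preformat>
<p>Because <italic>R</italic><sub><italic>vs</italic></sub> is a per-sample ratio, samples sequenced to different depths become comparable before the rank-based tests are applied.</p>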
</sec>
<sec>
<title>2.4. HPV type verification</title>
<p>Samples of Set 1 were assessed for HPV (type) presence using the GP5&#x0002B;/6&#x0002B; PCR followed by enzyme immunoassay (EIA) read-out system using a probe cocktail of 14 high-risk HPV types (HPV16, 18, 31, 33, 35, 39, 45, 51, 52, 56, 58, 59, 66, and 68) (Jacobs et al., <xref ref-type="bibr" rid="B23">1997</xref>). Reverse line blot was used to genotype all EIA-positive samples (van den Brule et al., <xref ref-type="bibr" rid="B51">2002</xref>) using probes for HPV-types 6, 11, 16, 18, 26, 30, 31, 33, 34, 35, 39, 40, 43, 45, 51, 52, 53, 54, 55, 56, 57, 58, 59, 61, 66, 67, 68, 69, 70, 71, 73, 81, and 82. Samples of Set 2 were tested for presence of high-risk HPV-DNA with both the HPV GP5&#x0002B;/6&#x0002B; general primers, and HPV16- and HPV18-specific primers (Wisman et al., <xref ref-type="bibr" rid="B57">2006</xref>) as performed routinely in the ISO-15189 accredited laboratory. In all tests a serial dilution of DNA extracted from CaSki (ATCC; CRL1550; 500 integrated HPV16 copies), HeLa (ATCC; CCL2; 20&#x02013;50 integrated HPV 18 copies), SiHa (ATCC; HTB35; 1&#x02013;2 integrated HPV16 copies), CC10B (HPV45-positive cell line) and CC11 (HPV67 positive cell line), and HPV-negative cell lines were included as control for the analytical specificity and sensitivity of each hrHPV-PCR (Tjon Pian Gi et al., <xref ref-type="bibr" rid="B50">2014</xref>). To assess the MBD-seq based HPV type identification, concordances for samples with and without the specific HPV types were calculated.</p>
</sec>
</sec>
<sec sec-type="results" id="s3">
<title>3. Results</title>
<sec>
<title>3.1. Non-canonical fragments</title>
<p>On average, 29% (<italic>SD</italic> &#x0003D; 9%) of all fragments in each sample could not be aligned to the human reference genome. Of these reads, only 0.31% (<italic>SD</italic> &#x0003D; 0.17%) could be mapped to the viral reference genomes. In total, we tried to map reads of 4.3&#x000D7;10<sup>8</sup> non-canonical fragments to 6433 different viral genomes, obtained after removal of very similar genomes (see Materials and Methods). More details about the mapping statistics can be found in the Supplementary Material. As MBD-seq enriches for methylated CpGs, a high-quality dataset should include only a limited amount of fragments without any CpG, and most fragments should have multiple CpGs (De Meyer et al., <xref ref-type="bibr" rid="B510">2013</xref>). This holds for both sample sets (1 and 2) as depicted in Figure <xref ref-type="fig" rid="F1">1A</xref>. Differences in the number of CpGs per fragment per sample between Sets 1 and 2 can be explained by differences in fragment length (Figure <xref ref-type="fig" rid="F1">1B</xref>). Overall, this analysis suggests that most identified viruses (see below) are indeed methylated.</p>
<fig id="F1" position="float">
<label>Figure 1</label>
<caption><p><bold>Fragment CpG content</bold>. <bold>(A)</bold> Histogram of CpG content per sample set. <bold>(B)</bold> Relation of CpG content and length per sample.</p></caption>
<graphic xlink:href="fgene-06-00016-g0001.tif"/>
</fig>
</sec>
<sec>
<title>3.2. Detected viruses</title>
<p>In a first phase, the presence of specific viruses in the different sample sets was assessed (see Table <xref ref-type="table" rid="T2">2</xref>). For all samples, fragments similar to the human endogenous retrovirus K113 (HERV-K113) could be identified. However, as sequence identities of mapped reads with HERV-K113 were sometimes as low as 75%, it is most likely that these fragments originate from other HERV-Ks as well. A significant difference in <italic>R</italic><sub><italic>vs</italic></sub> for these HERV-K113-like fragments could be demonstrated between the histological groups (<italic>p</italic> &#x0003C; 0.0001), but <italic>post-hoc</italic> tests revealed significantly higher HERV-K113-like fractions only in the cell cultures compared to normal tissue, CIN2/3 and carcinomas (all <italic>p</italic> &#x02264; 0.001, data not shown).</p>
<table-wrap position="float" id="T2">
<label>Table 2</label>
<caption><p><bold>Sample counts (<italic>N</italic><sub><italic>vs</italic></sub>) of relevant identified viruses</bold>.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th/>
<th valign="top" align="left"><bold>Cell culture</bold></th>
<th valign="top" align="center"><bold>Carcinoma</bold></th>
<th valign="top" align="center"><bold>CIN2/3</bold></th>
<th valign="top" align="center"><bold>CIN1</bold></th>
<th valign="top" align="center"><bold>Normal</bold></th>
<th valign="top" align="center"><bold>Leukocytes</bold></th>
<th valign="top" align="center"><bold>Total</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">HERV-K113</td>
<td valign="top" align="center">15</td>
<td valign="top" align="center">22</td>
<td valign="top" align="center">30</td>
<td valign="top" align="center">2</td>
<td valign="top" align="center">20</td>
<td valign="top" align="center">2</td>
<td valign="top" align="center">91</td>
</tr>
<tr>
<td valign="top" align="left">phage phiX174</td>
<td valign="top" align="center">7</td>
<td valign="top" align="center">20</td>
<td valign="top" align="center">29</td>
<td valign="top" align="center">2</td>
<td valign="top" align="center">20</td>
<td valign="top" align="center">2</td>
<td valign="top" align="center">80</td>
</tr>
<tr>
<td valign="top" align="left">Human adenoviruses</td>
<td valign="top" align="center">5</td>
<td valign="top" align="center">10</td>
<td valign="top" align="center">9</td>
<td valign="top" align="center">2</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">27</td>
</tr>
<tr>
<td valign="top" align="left">Merkel cell polyomavirus</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">2</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">2</td>
</tr>
<tr>
<td valign="top" align="left">Epstein-Barr virus</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">6</td>
<td valign="top" align="center">4</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">11</td>
</tr>
<tr>
<td valign="top" align="left">Human cytomegalovirus</td>
<td valign="top" align="center">7</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">7</td>
</tr>
<tr>
<td valign="top" align="left">Human herpesvirus 1</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">1</td>
</tr>
<tr>
<td valign="top" align="left">Human herpesvirus 6</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">1</td>
</tr>
<tr>
<td valign="top" align="left">Human herpesvirus 7</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">1</td>
</tr>
<tr>
<td valign="top" align="left">Human papillomavirus</td>
<td valign="top" align="center">14</td>
<td valign="top" align="center">21</td>
<td valign="top" align="center">30</td>
<td valign="top" align="center">2</td>
<td valign="top" align="center">11</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">79</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>Phages were also frequently observed in various samples, though in very low abundances in all cell culture samples and CIN1 samples. Enterobacteria phage PhiX174 was the most frequently occurring phage. This is not surprising, as PhiX174 is used as a spike-in for quality and calibration control in the Illumina sequencing protocol. Other phages observed at lower levels included phage lambda and phage P1 (data not shown).</p>
<p>Human adenoviruses were discovered in multiple samples. More fragments were observed in samples originating from Set 1 than in those from Set 2. The most frequently occurring types were human adenovirus C and human adenovirus B. Two CIN2/3 samples contained a single fragment of the Merkel cell polyomavirus. Multiple samples, particularly carcinoma and CIN2/3 samples, were found to contain between 1 and 25 fragments of the Epstein-Barr virus. Human cytomegalovirus was only detected in cell culture samples. However, these fragments most likely originate from the CMV promoter, which is included in the pcDNAI neo plasmid. Human herpesvirus 1, 6, and 7 were also identified, each in just a single sample.</p>
<p>HPV was detected in all but one sample in each of the carcinoma and cell culture groups. It was discovered in all samples originating from the CIN2/3 group and in 11 of the 20 normal samples. HPV was also detected in both CIN1 samples and in one of the two leukocyte samples. The association between the presence of HPV and cervical groups (excluding cell culture samples and leukocytes) was assessed by Pearson&#x00027;s Chi-squared test with a simulated <italic>p</italic>-value (<italic>p</italic> &#x0003C; 0.001).</p>
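<p>As an illustration of this kind of test, the sketch below recomputes Pearson&#x00027;s Chi-squared statistic from the HPV-negative and HPV-positive sample counts of the four cervical groups as listed in Table <xref ref-type="table" rid="T4">4</xref>, and approximates the <italic>p</italic>-value by Monte Carlo simulation. The software used for the original analysis is not stated in this section; the label-permutation scheme, the function names (<monospace>chi2_stat</monospace>, <monospace>simulated_p</monospace>), the number of simulations and the random seed are assumptions made for this example only.</p>
<preformat><![CDATA[
```python
import random

# Counts taken from Table 4 (cervical groups only).
# Columns: carcinoma, CIN2/3, CIN1, normal.
HPV_TABLE = [
    [1, 0, 0, 9],     # samples without HPV
    [21, 30, 2, 11],  # samples with at least one HPV type
]


def chi2_stat(table):
    """Pearson's Chi-squared statistic of a contingency table."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            stat += (obs - expected) ** 2 / expected
    return stat


def simulated_p(table, n_sim=2000, seed=1):
    """Monte Carlo p-value: shuffle HPV status over samples, keeping margins fixed."""
    rng = random.Random(seed)  # assumed seed, only for reproducibility
    observed = chi2_stat(table)
    status = []  # row index (HPV status) of every individual sample
    group = []   # column index (histological group) of every individual sample
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            status.extend([i] * count)
            group.extend([j] * count)
    n_cols = len(table[0])
    hits = 0
    for _ in range(n_sim):
        rng.shuffle(status)
        sim = [[0] * n_cols for _ in table]
        for i, j in zip(status, group):
            sim[i][j] += 1
        if chi2_stat(sim) >= observed - 1e-9:
            hits += 1
    # Add-one estimate, so the p-value is never exactly zero
    return observed, (hits + 1) / (n_sim + 1)


stat, p_value = simulated_p(HPV_TABLE)
print(f"Chi-squared = {stat:.2f}, simulated p = {p_value:.4f}")
```
]]></preformat>
<p>With these counts the statistic is approximately 23.5. Because the simulated <italic>p</italic>-value cannot fall below 1/(number of simulations + 1), the number of simulations has to be large enough to resolve a threshold such as 0.001.</p>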
<p>In addition to assessing the (differential) presence of specific viruses, a quantitative analysis can also be performed. To illustrate the feasibility, HPV <italic>R</italic><sub><italic>vs</italic></sub> values in HPV-positive samples were compared between the carcinoma, CIN2/3, CIN1 and normal groups (see also Figure <xref ref-type="fig" rid="F2">2</xref>). A significant difference between these groups was demonstrated (<italic>p</italic> &#x0003D; 0.0001). <italic>Post-hoc</italic> analyses revealed significant differences between the normal group and both the carcinoma and CIN2/3 groups (see Table <xref ref-type="table" rid="T3">3</xref>). It should be noted that the absence of significance for the comparisons with CIN1 may be explained by a lack of power (<italic>n</italic> &#x0003D; 2).</p>
<fig id="F2" position="float">
<label>Figure 2</label>
<caption><p><bold>Normalized HPV fragment counts (<italic>R</italic><sub><italic>vs</italic></sub>) within each sample for which HPV was found, per histological group</bold>.</p></caption>
<graphic xlink:href="fgene-06-00016-g0002.tif"/>
</fig>
<table-wrap position="float" id="T3">
<label>Table 3</label>
<caption><p><bold>Comparison of HPV fragment counts between cervical histological groups</bold>.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th/>
<th valign="top" align="left"><bold>Carcinoma</bold></th>
<th valign="top" align="center"><bold>CIN2/3</bold></th>
<th valign="top" align="center"><bold>CIN1</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">CIN2/3</td>
<td valign="top" align="center">1</td>
<td/>
<td/>
</tr>
<tr>
<td valign="top" align="left">CIN1</td>
<td valign="top" align="center">0.4</td>
<td valign="top" align="center">1</td>
<td/>
</tr>
<tr>
<td valign="top" align="left">Normal</td>
<td valign="top" align="center">&#x0003C;10<sup>&#x02212;5</sup></td>
<td valign="top" align="center">&#x0003C;10<sup>&#x02212;5</sup></td>
<td valign="top" align="center">0.8</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><italic>Values are p-values obtained by post-hoc Mann-Whitney-Wilcoxon test and adjusted by Bonferroni correction</italic>.</p>
</table-wrap-foot>
</table-wrap>
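<p>The Bonferroni adjustment mentioned in the footnote of Table <xref ref-type="table" rid="T3">3</xref> multiplies each raw <italic>p</italic>-value by the number of comparisons and caps the result at 1, which would explain why several entries of the table equal exactly 1. A minimal sketch of this adjustment follows; the raw <italic>p</italic>-values are hypothetical placeholders rather than values from this study, and the function name is an assumption.</p>
<preformat><![CDATA[
```python
def bonferroni(p_values):
    """Multiply each raw p-value by the number of tests and cap the result at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p-values for the six pairwise comparisons between
# the four cervical groups (NOT taken from the study).
RAW = {
    "carcinoma vs CIN2/3": 0.5,
    "carcinoma vs CIN1": 0.05,
    "carcinoma vs normal": 0.000001,
    "CIN2/3 vs CIN1": 0.4,
    "CIN2/3 vs normal": 0.000001,
    "CIN1 vs normal": 0.1,
}

adjusted = dict(zip(RAW, bonferroni(list(RAW.values()))))
for comparison, p in adjusted.items():
    print(f"{comparison}: adjusted p = {p:.2g}")
```
]]></preformat>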
<p>Often multiple HPV types were detected per sample, as can be observed in Figure <xref ref-type="fig" rid="F3">3A</xref> and Table <xref ref-type="table" rid="T4">4</xref>. Figures <xref ref-type="fig" rid="F3">3B&#x02013;D</xref> show which HPV types were detected in each histological group. The most frequently detected HPV type in primary cervical samples was HPV16 (<italic>N</italic> &#x0003D; 30), followed by HPV31 (<italic>N</italic> &#x0003D; 12), HPV39 (<italic>N</italic> &#x0003D; 9), HPV18 (<italic>N</italic> &#x0003D; 6), and HPV36 (<italic>N</italic> &#x0003D; 6). No other HPV type was observed in more than four different samples (see Figure <xref ref-type="fig" rid="F3">3</xref>). We observed a higher relative occurrence of high-risk HPV types in HPV-positive carcinoma and CIN2/3 samples than in normal samples, but the association was not significant.</p>
<fig id="F3" position="float">
<label>Figure 3</label>
<caption><p><bold>(A)</bold> Detection of HPV types: number of HPV types found per sample. <bold>(B&#x02013;D)</bold> Stacked barplot of HPV types found per sample with n-th most fragments within the different groups. Types colored red to gold and types colored blue correspond to high-risk and low-risk HPV types, respectively.</p></caption>
<graphic xlink:href="fgene-06-00016-g0003.tif"/>
</fig>
<table-wrap position="float" id="T4">
<label>Table 4</label>
<caption><p><bold>Overview of the number of identified HPV types in the different sample groups</bold>.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th/>
<th valign="top" align="left"><bold>Cell culture</bold></th>
<th valign="top" align="center"><bold>Carcinoma</bold></th>
<th valign="top" align="center"><bold>CIN2/3</bold></th>
<th valign="top" align="center"><bold>CIN1</bold></th>
<th valign="top" align="center"><bold>Normal</bold></th>
<th valign="top" align="center"><bold>Leukocyte</bold></th>
<th valign="top" align="center"><bold>Total</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">No HPV</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">9</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">12</td>
</tr>
<tr>
<td valign="top" align="left">1 HPV type</td>
<td valign="top" align="center">7</td>
<td valign="top" align="center">12</td>
<td valign="top" align="center">17</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">5</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">43</td>
</tr>
<tr>
<td valign="top" align="left">2 HPV types</td>
<td valign="top" align="center">6</td>
<td valign="top" align="center">9</td>
<td valign="top" align="center">9</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">6</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">31</td>
</tr>
<tr>
<td valign="top" align="left">3 HPV types</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">4</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">5</td>
</tr>
<tr>
<td valign="top" align="left">Total</td>
<td valign="top" align="center">15</td>
<td valign="top" align="center">22</td>
<td valign="top" align="center">30</td>
<td valign="top" align="center">2</td>
<td valign="top" align="center">20</td>
<td valign="top" align="center">2</td>
<td valign="top" align="center">91</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>Though the HPV type analysis yielded relevant results, the overall accuracy of this approach should be evaluated as well. Therefore, verification of the HPV types was performed using independent methods (see Materials and Methods), which we consider here as the gold standard. The independent validation of the HPV types yielded a positive verification in 71% of the detections. For HPV types indicated to be present by these methods, results were 66% concordant with the MBD-seq approach. Conversely, verified absence of viruses was 98% concordant with the proposed methodology. The more fragments that were detected for an identified HPV type, the more likely it was to be validated, as can be seen in Table <xref ref-type="table" rid="T5">5</xref>. As the verification methodology differed between Sets 1 and 2, results per sample collection are shown in Figure <xref ref-type="fig" rid="F4">4</xref>.</p>
<table-wrap position="float" id="T5">
<label>Table 5</label>
<caption><p><bold>Overview of the number of validated HPV types according to the amount of detected fragments</bold>.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th/>
<th valign="top" align="left"><bold>1</bold></th>
<th valign="top" align="center"><bold>2&#x02013;10</bold></th>
<th valign="top" align="center"><bold>11&#x02013;100</bold></th>
<th valign="top" align="center"><bold>&#x0003E;100</bold></th>
<th valign="top" align="center"><bold>Total</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Unvalidated</td>
<td valign="top" align="center">11</td>
<td valign="top" align="center">6</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">4</td>
<td valign="top" align="center">22</td>
</tr>
<tr>
<td valign="top" align="left">Validated</td>
<td valign="top" align="center">4</td>
<td valign="top" align="center">6</td>
<td valign="top" align="center">19</td>
<td valign="top" align="center">18</td>
<td valign="top" align="center">47</td>
</tr>
<tr>
<td valign="top" align="left">Total</td>
<td valign="top" align="center">15</td>
<td valign="top" align="center">12</td>
<td valign="top" align="center">20</td>
<td valign="top" align="center">22</td>
<td valign="top" align="center">69</td>
</tr>
</tbody>
</table>
</table-wrap>
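<p>The relation between fragment count and validation in Table <xref ref-type="table" rid="T5">5</xref> can be made explicit by computing the validated fraction per fragment-count bin. The sketch below uses only the counts printed in the table; the variable and function names are assumptions made for this example.</p>
<preformat><![CDATA[
```python
# Counts copied from Table 5, one entry per fragment-count bin.
BINS = ["1", "2-10", "11-100", ">100"]
UNVALIDATED = [11, 6, 1, 4]
VALIDATED = [4, 6, 19, 18]


def validation_rates(validated, unvalidated):
    """Fraction of detections that were validated, per bin."""
    return [v / (v + u) for v, u in zip(validated, unvalidated)]


rates = dict(zip(BINS, validation_rates(VALIDATED, UNVALIDATED)))
for label, rate in rates.items():
    print(f"{label:>6} fragments: {rate:.0%} validated")
```
]]></preformat>
<p>In this table the validated fraction rises from 4 of 15 detections supported by a single fragment to 19 of 20 detections supported by 11 to 100 fragments.</p>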
<fig id="F4" position="float">
<label>Figure 4</label>
<caption><p><bold>HPV type validation</bold>.</p></caption>
<graphic xlink:href="fgene-06-00016-g0004.tif"/>
</fig>
</sec>
</sec>
<sec sec-type="discussion" id="s4">
<title>4. Discussion</title>
<p>In this study, we demonstrated that the non-canonical fraction of MBD-seq fragments can be used to identify viruses. Considering the increasing importance of sequencing methods, this strategy can provide key evidence regarding the involvement of specific viruses in pathologies at minimal additional cost. Given the roles of DNA methylation in virus biology, the outlined pipeline is capable of generating valuable hypotheses from otherwise unused data. As the outlined application also has several disadvantages (see below), the generated hypotheses should of course be additionally validated by state-of-the-art methods. The observed CpG content in many cervical samples, in comparison with De Meyer et al. (<xref ref-type="bibr" rid="B510">2013</xref>), suggests that most viral mapped fragments are methylated. It should be noted that unmethylated viral fragments picked up as &#x0201C;noise&#x0201D; may also be relevant, but that the sensitivity for these fragments will most likely be too low to link them to the specific pathology under study.</p>
<p>Some recent studies have already achieved viral identification in RNA-seq experiments by comparable methods (Chen et al., <xref ref-type="bibr" rid="B7">2013</xref>; Salyakina and Tsinoremas, <xref ref-type="bibr" rid="B41">2013</xref>). These studies found a substantial presence of oncoviruses by means of their transcripts. However, integrated viruses may be temporarily transcriptionally silent, often by DNA methylation. This makes the proposed methodology a good complement to RNA-seq for viral identification, as transcriptionally silenced viruses will also be detected. Moreover, it is capable of revealing epigenetic information about the clinical virus biology. Our method is generic and could be used in combination with other NGS techniques. However, FR-HIT does not account for splicing events, which might restrict its applicability to RNA-seq data.</p>
<p>The outlined approach was used on cervical samples of different origin, both histologically and study-wise, and multiple viruses were detected. Not unexpectedly, fragments originating from HERV-K were observed in every sample, which can be considered a positive control as HERV-K is an endogenous retrovirus (Hohn et al., <xref ref-type="bibr" rid="B21">2013</xref>). Significantly more HERV-K fragments were observed in the cell culture samples than in normal tissue, CIN2/3 and carcinomas, which might reflect methylation differences between cell culture and primary samples (Smiraglia, <xref ref-type="bibr" rid="B45">2001</xref>; Varley et al., <xref ref-type="bibr" rid="B52">2013</xref>). This result therefore provides a first indication that a quantitative comparison of virus count data may also yield relevant information. Other expected detections include phage PhiX174 DNA from Illumina spike-ins and CMV that originated from the pcDNAI neo plasmid in cell culture samples. Indeed, plasmids have been shown to be methylated, which can interfere with specific experiments (Hong et al., <xref ref-type="bibr" rid="B22">2001</xref>).</p>
<p>Interestingly, we detected several oncoviruses in the cervical samples besides HPV. Merkel cell polyomavirus, known to cause Merkel cell carcinoma, was found to be present in two CIN2/3 samples (Feng et al., <xref ref-type="bibr" rid="B513">2008</xref>). Another identified oncovirus is the Epstein-Barr virus. Although not significant, an apparent association between the presence of the Epstein-Barr virus and histological group hints toward an oncogenic role in cervical cancer, as has been suggested previously (Szostek et al., <xref ref-type="bibr" rid="B49">2009</xref>). However, since the counts for the Epstein-Barr virus were low, an origin of these viral fragments in infiltrating lymphocytes is at least an equally plausible explanation (Grywalska et al., <xref ref-type="bibr" rid="B517">2013</xref>). Results from this study therefore indicate that additional research should be performed regarding the impact of Epstein-Barr virus and Merkel cell polyomavirus superinfection in CIN2/3 and carcinomas, preferably in far larger groups.</p>
<p>The most prevalent oncovirus, however, was HPV, as expected. As the prevalence of HPV in the cervix and its causal role in cervical cancer are well documented, the virus detection efficiency of the proposed methodology verifies the capabilities of our method (Clifford et al., <xref ref-type="bibr" rid="B8">2005</xref>; Armstrong, <xref ref-type="bibr" rid="B4">2010</xref>). The role of HPV in cervical cancer was shown by two comparisons. First, there is a significant association between HPV occurrence and histological group. Second, in HPV-positive samples we observed a significant increase in total HPV fragments per sample in cell culture, carcinoma or CIN2/3 samples vs. normal samples. The latter observation could be due to more HPV and/or more HPV methylation. More DNA methylation of the HPV genome in carcinomas is in accordance with observations for HPV16 and HPV18 as reported by Fernandez et al. (<xref ref-type="bibr" rid="B515">2009</xref>).</p>
<p>However, note that the quantitative evaluation of methylated viruses may also be affected by the global genomic methylation state. Genomic hypermethylation, as often observed in cell lines (Smiraglia, <xref ref-type="bibr" rid="B45">2001</xref>), might suppress viral estimates as their relative abundance in the total methylated fraction may drop. On the other hand, overall hypermethylation may also lead directly to increased viral methylation, and therefore higher sensitivity for MBD-based capturing. A similar reasoning may be relevant for tumor samples, which might feature global hypomethylation (Li et al., <xref ref-type="bibr" rid="B33">2014</xref>). In other words, the overall methylation state will have an important impact, but the exact effect depends on how much viral methylation itself or the detection of methylation is affected by it.</p>
<p>Phages were detected in primary samples from both sample sets and were absent in all cell culture samples. This is not unexpected, as the female genital tract is characterized by a complex microbial flora and phage genomes have long been reported to be methylated (Kr&#x000FC;ger and Bickle, <xref ref-type="bibr" rid="B30">1983</xref>; Martin et al., <xref ref-type="bibr" rid="B36">2012</xref>). The presence of human adenoviruses might be explained by contamination. Both human adenovirus B and C are known to play a role in respiratory diseases, which suggests a possible route of contamination (Jones et al., <xref ref-type="bibr" rid="B25">2007</xref>). The remarkable difference in human adenovirus fragment occurrence between the sample sets reinforces this hypothesis. Observation of HPV in one leukocyte sample might be explained by contamination as well.</p>
<p>Hence, for virus detections with a low fragment count, one should be cautious in concluding viral presence. The high sensitivity of NGS makes results regarding presence or absence easily affected by contamination (Yozwiak et al., <xref ref-type="bibr" rid="B58">2012</xref>). For example, HPV39 was detected several times at low fragment count in samples that were run in the same Illumina Genome Analyzer lane as one sample with a remarkably high HPV39 fragment count. Also, the high number of HPV39-positive samples seems to deviate from its relatively low prevalence in Europe, in contrast with the other HPV types (16, 18, 31) (Clifford et al., <xref ref-type="bibr" rid="B8">2005</xref>). These fragments were most likely assigned to the wrong sample due to carryover associated with common inaccuracies in Illumina multiplex sequencing (Kircher et al., <xref ref-type="bibr" rid="B29">2012</xref>). Improper identification due to wrong mapping is less likely, as viral genomes with high similarity were represented by only one reference genome per group. Furthermore, we checked some of the single HPV hits by aligning them with BLAST against the NCBI nucleotide archive, which returned the identified HPV types as best hits. Contamination might therefore partly explain the seemingly high superinfection rate of HPV types. One might therefore opt to only call virus presence upon identification with a minimum fragment count, for example 10 (as also suggested by Yozwiak et al., <xref ref-type="bibr" rid="B58">2012</xref> and Salyakina and Tsinoremas, <xref ref-type="bibr" rid="B41">2013</xref>). Additionally, the use of double indexing during Illumina multiplex sequencing will remove a major experimental source of carryover contamination (Kircher et al., <xref ref-type="bibr" rid="B29">2012</xref>). For example, all but two of the HPV detections with more than 10 fragments in samples of Set 1 could be verified. Alternatively, apart from contamination, MBD-seq might also have a higher sensitivity than the methylation-naive verification methods, owing to enrichment for methylation. However, it will likely not detect viruses for which no methylated DNA is present.</p>
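<p>The minimum fragment count rule suggested above can be sketched as a simple filter on per-sample counts. The example counts below are hypothetical, the function name is an assumption, and treating the threshold of 10 as inclusive is an interpretation made for this example.</p>
<preformat><![CDATA[
```python
MIN_FRAGMENTS = 10  # example threshold from the text; inclusive by assumption


def call_presence(fragment_counts, min_fragments=MIN_FRAGMENTS):
    """Keep only the viruses whose fragment count reaches the threshold."""
    return {
        virus: count
        for virus, count in fragment_counts.items()
        if count >= min_fragments
    }


# Hypothetical fragment counts for one sample (NOT taken from the study).
sample_counts = {
    "HPV16": 240,
    "HPV31": 10,
    "HPV39": 2,
    "Epstein-Barr virus": 1,
}

called = call_presence(sample_counts)
print(called)
```
]]></preformat>
<p>A stricter threshold lowers the risk of calling carryover fragments as true presence, at the cost of missing genuine low-abundance viruses.</p>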
<p>Another limitation of this best mapping hit based approach is that it depends entirely on existing known viral genomes. In this study, only full genomes from NCBI and ENA were used. However, as the portion of sequenced genomes (6433 in our dataset) is very limited compared to the number of mammalian viruses, estimated at 320,000 (Anthony et al., <xref ref-type="bibr" rid="B3">2013</xref>), it is very likely that many viruses will be missed by this method. Related viruses can be detected by lowering the stringency of sequence similarity. However, this makes it increasingly difficult to distinguish viral types. Distinct viral types will also be harder to distinguish when the set of reference genomes increases, as more similar genomes enter. This problem can be solved by clustering and removing similar genomes or by technological advances that increase the length of the sequenced reads. Finally, horizontal gene transfer or ancestral viral integrations may also create false positives. <italic>De novo</italic> assembly of viruses using unmapped fragments largely avoids the dependency on current knowledge and mapping problems, but will require high coverage to obtain sufficient amounts of viral fragments and will be hampered by unmethylated regions of the viral genome.</p>
<p>Generally, we can conclude that this method is effective in detecting fragments of methylated viral DNA. This could be verified by HPV detection in the cervix case study, demonstrating (i) association of HPV presence and histological group, (ii) differential quantities of HPV fragments in HPV-positive samples between normal samples and carcinoma or CIN2/3 samples, and (iii) type detection with good concordance as verified by independent methods. In other words, if the impact of HPV in cervical cancer had been unknown, it might have been picked up by the outlined approach, though additional validation would of course have been absolutely necessary. It is therefore clear that the methodology can generate novel knowledge regarding the presence of viruses in disease, and that the inherent disadvantages are by far outweighed by the major benefit of obtaining information regarding the presence of any sequenced virus in otherwise typically discarded data.</p>
</sec>
<sec>
<title>Author contributions</title>
<p>Ed Schuuring, Renske D. M. Steenbergen and G. Bea A. Wisman provided the data regarding the cervical samples and performed the HPV type verifications for these samples. Tim De Meyer and Klaas Mensaert conceived the general idea and approach. Klaas Mensaert and Tim De Meyer designed the pipeline, analyzed the data and wrote the manuscript. Wim Van Criekinge and Olivier Thas contributed to the conceptual development and provided critical advice. All authors have reviewed the article and approved the final manuscript.</p>
<sec>
<title>Conflict of interest statement</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p></sec>
</sec>
</body>
<back>
<ack>
<p>We acknowledge the financial support of the N2N Multidisciplinary Research Partnership of the University of Ghent for both Klaas Mensaert and Tim De Meyer.</p>
</ack>
<sec sec-type="supplementary-material" id="s5">
<title>Supplementary material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="http://www.frontiersin.org/journal/10.3389/fgene.2015.00016/abstract">http://www.frontiersin.org/journal/10.3389/fgene.2015.00016/abstract</ext-link></p>
<supplementary-material xlink:href="DataSheet1.PDF" mimetype="application/pdf" xmlns:xlink="http://www.w3.org/1999/xlink"/>
</sec>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Aberg</surname> <given-names>K. A.</given-names></name> <name><surname>McClay</surname> <given-names>J. L.</given-names></name> <name><surname>Nerella</surname> <given-names>S.</given-names></name> <name><surname>Xie</surname> <given-names>L. Y.</given-names></name> <name><surname>Clark</surname> <given-names>S. L.</given-names></name> <name><surname>Hudson</surname> <given-names>A. D.</given-names></name> <etal/></person-group>. (<year>2012</year>). <article-title>MBD-seq as a cost-effective approach for methylome-wide association studies: demonstration in 1500 case&#x02013;control samples</article-title>. <source>Epigenomics</source> <volume>4</volume>, <fpage>605</fpage>&#x02013;<lpage>621</lpage>. <pub-id pub-id-type="doi">10.2217/epi.12.59</pub-id><pub-id pub-id-type="pmid">23244307</pub-id></citation>
</ref>
<ref id="B2">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Allander</surname> <given-names>T.</given-names></name> <name><surname>Emerson</surname> <given-names>S. U.</given-names></name> <name><surname>Engle</surname> <given-names>R. E.</given-names></name> <name><surname>Purcell</surname> <given-names>R. H.</given-names></name> <name><surname>Bukh</surname> <given-names>J.</given-names></name></person-group> (<year>2001</year>). <article-title>A virus discovery method incorporating DNase treatment and its application to the identification of two bovine parvovirus species</article-title>. <source>Proc. Natl. Acad. Sci. U.S.A</source>. <volume>98</volume>, <fpage>11609</fpage>&#x02013;<lpage>11614</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.211424698</pub-id><pub-id pub-id-type="pmid">11562506</pub-id></citation>
</ref>
<ref id="B3">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Anthony</surname> <given-names>S. J.</given-names></name> <name><surname>Epstein</surname> <given-names>J. H.</given-names></name> <name><surname>Murray</surname> <given-names>K. A.</given-names></name> <name><surname>Navarrete-Macias</surname> <given-names>I.</given-names></name> <name><surname>Zambrana-Torrelio</surname> <given-names>C. M.</given-names></name> <name><surname>Solovyov</surname> <given-names>A.</given-names></name> <etal/></person-group>. (<year>2013</year>). <article-title>A strategy to estimate unknown viral diversity in mammals</article-title>. <source>mBio</source> <volume>4</volume>:<fpage>e00598-13</fpage>. <pub-id pub-id-type="doi">10.1128/mBio.00598-13</pub-id><pub-id pub-id-type="pmid">24003179</pub-id></citation>
</ref>
<ref id="B4">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Armstrong</surname> <given-names>E. P.</given-names></name></person-group> (<year>2010</year>). <article-title>Prophylaxis of cervical cancer and related cervical disease: a review of the cost-effectiveness of vaccination against oncogenic HPV types</article-title>. <source>J. Manag. Care Pharm</source>. <volume>16</volume>, <fpage>217</fpage>&#x02013;<lpage>230</lpage>. <pub-id pub-id-type="pmid">20331326</pub-id></citation>
</ref>
<ref id="B5">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bexfield</surname> <given-names>N.</given-names></name> <name><surname>Kellam</surname> <given-names>P.</given-names></name></person-group> (<year>2011</year>). <article-title>Metagenomics and the molecular identification of novel viruses</article-title>. <source>Vet. J</source>. <volume>190</volume>, <fpage>191</fpage>&#x02013;<lpage>198</lpage>. <pub-id pub-id-type="doi">10.1016/j.tvjl.2010.10.014</pub-id><pub-id pub-id-type="pmid">21111643</pub-id></citation>
</ref>
<ref id="B6">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Burgers</surname> <given-names>W. A.</given-names></name> <name><surname>Blanchon</surname> <given-names>L.</given-names></name> <name><surname>Pradhan</surname> <given-names>S.</given-names></name> <name><surname>de Launoit</surname> <given-names>Y.</given-names></name> <name><surname>Kouzarides</surname> <given-names>T.</given-names></name> <name><surname>Fuks</surname> <given-names>F.</given-names></name></person-group> (<year>2007</year>). <article-title>Viral oncoproteins target the DNA methyltransferases</article-title>. <source>Oncogene</source> <volume>26</volume>, <fpage>1650</fpage>&#x02013;<lpage>1655</lpage>. <pub-id pub-id-type="doi">10.1038/sj.onc.1209950</pub-id><pub-id pub-id-type="pmid">16983344</pub-id></citation>
</ref>
<ref id="B7">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chen</surname> <given-names>Y.</given-names></name> <name><surname>Yao</surname> <given-names>H.</given-names></name> <name><surname>Thompson</surname> <given-names>E. J.</given-names></name> <name><surname>Tannir</surname> <given-names>N. M.</given-names></name> <name><surname>Weinstein</surname> <given-names>J. N.</given-names></name> <name><surname>Su</surname> <given-names>X.</given-names></name></person-group> (<year>2013</year>). <article-title>VirusSeq: software to identify viruses and their integration sites using next-generation sequencing of human cancer tissue</article-title>. <source>Bioinformatics</source> <volume>29</volume>, <fpage>266</fpage>&#x02013;<lpage>267</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/bts665</pub-id><pub-id pub-id-type="pmid">23162058</pub-id></citation>
</ref>
<ref id="B8">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Clifford</surname> <given-names>G. M.</given-names></name> <name><surname>Gallus</surname> <given-names>S.</given-names></name> <name><surname>Herrero</surname> <given-names>R.</given-names></name> <name><surname>Mu&#x00144;oz</surname> <given-names>N.</given-names></name> <name><surname>Snijders</surname> <given-names>P. J. F.</given-names></name> <name><surname>Vaccarella</surname> <given-names>S.</given-names></name> <etal/></person-group>. (<year>2005</year>). <article-title>Worldwide distribution of human papillomavirus types in cytologically normal women in the international agency for research on cancer HPV prevalence surveys: a pooled analysis</article-title>. <source>Lancet</source> <volume>366</volume>, <fpage>991</fpage>&#x02013;<lpage>998</lpage>. <pub-id pub-id-type="doi">10.1016/S0140-6736(05)67069-9</pub-id><pub-id pub-id-type="pmid">16168781</pub-id></citation>
</ref>
<ref id="B9">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>de Martel</surname> <given-names>C.</given-names></name> <name><surname>Ferlay</surname> <given-names>J.</given-names></name> <name><surname>Franceschi</surname> <given-names>S.</given-names></name> <name><surname>Vignat</surname> <given-names>J.</given-names></name> <name><surname>Bray</surname> <given-names>F.</given-names></name> <name><surname>Forman</surname> <given-names>D.</given-names></name> <etal/></person-group>. (<year>2012</year>). <article-title>Global burden of cancers attributable to infections in 2008: a review and synthetic analysis</article-title>. <source>Lancet Oncol</source>. <volume>13</volume>, <fpage>607</fpage>&#x02013;<lpage>615</lpage>. <pub-id pub-id-type="doi">10.1016/S1470-2045(12)70137-7</pub-id><pub-id pub-id-type="pmid">22575588</pub-id></citation>
</ref>
<ref id="B510">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>De Meyer</surname> <given-names>T.</given-names></name> <name><surname>Mampaey</surname> <given-names>E.</given-names></name> <name><surname>Vlemmix</surname> <given-names>M.</given-names></name> <name><surname>Denil</surname> <given-names>S.</given-names></name> <name><surname>Trooskens</surname> <given-names>G.</given-names></name> <name><surname>Renard</surname> <given-names>J.-P.</given-names></name> <etal/></person-group>. (<year>2013</year>). <article-title>Quality evaluation of methyl binding domain based kits for enrichment DNA-methylation sequencing</article-title>. <source>PLoS ONE</source> <volume>8</volume>:<fpage>e59068</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pone.0059068</pub-id><pub-id pub-id-type="pmid">23554971</pub-id></citation>
</ref>
<ref id="B511">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Doerfler</surname> <given-names>W.</given-names></name></person-group> (<year>2009</year>). <article-title>Epigenetic mechanisms in human adenovirus type 12 oncogenesis</article-title>. <source>Semin. Cancer Biol</source>. <volume>19</volume>, <fpage>136</fpage>&#x02013;<lpage>143</lpage>. <pub-id pub-id-type="doi">10.1016/j.semcancer.2009.02.009</pub-id><pub-id pub-id-type="pmid">19429476</pub-id></citation>
</ref>
<ref id="B512">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fancello</surname> <given-names>L.</given-names></name> <name><surname>Raoult</surname> <given-names>D.</given-names></name> <name><surname>Desnues</surname> <given-names>C.</given-names></name></person-group> (<year>2012</year>). <article-title>Computational tools for viral metagenomics and their application in clinical research</article-title>. <source>Virology</source> <volume>434</volume>, <fpage>162</fpage>&#x02013;<lpage>174</lpage>. <pub-id pub-id-type="doi">10.1016/j.virol.2012.09.025</pub-id><pub-id pub-id-type="pmid">23062738</pub-id></citation>
</ref>
<ref id="B513">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Feng</surname> <given-names>H.</given-names></name> <name><surname>Shuda</surname> <given-names>M.</given-names></name> <name><surname>Chang</surname> <given-names>Y.</given-names></name> <name><surname>Moore</surname> <given-names>P. S.</given-names></name></person-group> (<year>2008</year>). <article-title>Clonal integration of a polyomavirus in human Merkel cell carcinoma</article-title>. <source>Science</source> <volume>319</volume>, <fpage>1096</fpage>&#x02013;<lpage>1100</lpage>. <pub-id pub-id-type="doi">10.1126/science.1152586</pub-id><pub-id pub-id-type="pmid">18202256</pub-id></citation>
</ref>
<ref id="B514">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ferlay</surname> <given-names>J.</given-names></name> <name><surname>Shin</surname> <given-names>H.-R.</given-names></name> <name><surname>Bray</surname> <given-names>F.</given-names></name> <name><surname>Forman</surname> <given-names>D.</given-names></name> <name><surname>Mathers</surname> <given-names>C.</given-names></name> <name><surname>Parkin</surname> <given-names>D. M.</given-names></name></person-group> (<year>2010</year>). <article-title>Estimates of worldwide burden of cancer in 2008: GLOBOCAN 2008</article-title>. <source>Int. J. Cancer</source> <volume>127</volume>, <fpage>2893</fpage>&#x02013;<lpage>2917</lpage>. <pub-id pub-id-type="doi">10.1002/ijc.25516</pub-id><pub-id pub-id-type="pmid">21351269</pub-id></citation>
</ref>
<ref id="B515">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fernandez</surname> <given-names>A. F.</given-names></name> <name><surname>Rosales</surname> <given-names>C.</given-names></name> <name><surname>Lopez-Nieva</surname> <given-names>P.</given-names></name> <name><surname>Gra&#x000F1;a</surname> <given-names>O.</given-names></name> <name><surname>Ballestar</surname> <given-names>E.</given-names></name> <name><surname>Ropero</surname> <given-names>S.</given-names></name> <etal/></person-group>. (<year>2009</year>). <article-title>The dynamic DNA methylomes of double-stranded DNA viruses associated with human cancer</article-title>. <source>Genome Res</source>. <volume>19</volume>, <fpage>438</fpage>&#x02013;<lpage>451</lpage>. <pub-id pub-id-type="doi">10.1101/gr.083550.108</pub-id><pub-id pub-id-type="pmid">19208682</pub-id></citation>
</ref>
<ref id="B516">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fu</surname> <given-names>L.</given-names></name> <name><surname>Niu</surname> <given-names>B.</given-names></name> <name><surname>Zhu</surname> <given-names>Z.</given-names></name> <name><surname>Wu</surname> <given-names>S.</given-names></name> <name><surname>Li</surname> <given-names>W.</given-names></name></person-group> (<year>2012</year>). <article-title>CD-HIT: accelerated for clustering the next-generation sequencing data</article-title>. <source>Bioinformatics</source> <volume>28</volume>, <fpage>3150</fpage>&#x02013;<lpage>3152</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/bts565</pub-id><pub-id pub-id-type="pmid">23060610</pub-id></citation>
</ref>
<ref id="B517">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Grywalska</surname> <given-names>E.</given-names></name> <name><surname>Markowicz</surname> <given-names>J.</given-names></name> <name><surname>Grabarczyk</surname> <given-names>P.</given-names></name> <name><surname>Pasiarski</surname> <given-names>M.</given-names></name> <name><surname>Roli&#x00144;ski</surname> <given-names>J.</given-names></name></person-group> (<year>2013</year>). <article-title>Epstein-Barr virus-associated lymphoproliferative disorders</article-title>. <source>Postepy Hig. Med. Dosw</source>. <volume>67</volume>, <fpage>481</fpage>&#x02013;<lpage>490</lpage>. <pub-id pub-id-type="doi">10.5604/17322693.1050999</pub-id><pub-id pub-id-type="pmid">23752600</pub-id></citation>
</ref>
<ref id="B518">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Head</surname> <given-names>S. R.</given-names></name> <name><surname>Komori</surname> <given-names>H. K.</given-names></name> <name><surname>LaMere</surname> <given-names>S. A.</given-names></name> <name><surname>Whisenant</surname> <given-names>T.</given-names></name> <name><surname>Van Nieuwerburgh</surname> <given-names>F.</given-names></name> <name><surname>Salomon</surname> <given-names>D. R.</given-names></name> <etal/></person-group>. (<year>2014</year>). <article-title>Library construction for next-generation sequencing: overviews and challenges</article-title>. <source>Biotechniques</source> <volume>56</volume>, <fpage>61</fpage>&#x02013;<lpage>64</lpage>. <pub-id pub-id-type="doi">10.2144/000114133</pub-id><pub-id pub-id-type="pmid">24502796</pub-id></citation>
</ref>
<ref id="B519">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hochberg</surname> <given-names>Y.</given-names></name></person-group> (<year>1988</year>). <article-title>A sharper Bonferroni procedure for multiple tests of significance</article-title>. <source>Biometrika</source> <volume>75</volume>, <fpage>800</fpage>&#x02013;<lpage>802</lpage>. <pub-id pub-id-type="doi">10.1093/biomet/75.4.800</pub-id></citation>
</ref>
<ref id="B20">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hoelzer</surname> <given-names>K.</given-names></name> <name><surname>Shackelton</surname> <given-names>L. A.</given-names></name> <name><surname>Parrish</surname> <given-names>C. R.</given-names></name></person-group> (<year>2008</year>). <article-title>Presence and role of cytosine methylation in DNA viruses of animals</article-title>. <source>Nucleic Acids Res</source>. <volume>36</volume>, <fpage>2825</fpage>&#x02013;<lpage>2837</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gkn121</pub-id><pub-id pub-id-type="pmid">18367473</pub-id></citation>
</ref>
<ref id="B21">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hohn</surname> <given-names>O.</given-names></name> <name><surname>Hanke</surname> <given-names>K.</given-names></name> <name><surname>Bannert</surname> <given-names>N.</given-names></name></person-group> (<year>2013</year>). <article-title>HERV-K(HML-2), the best preserved family of HERVs: endogenization, expression, and implications in health and disease</article-title>. <source>Front. Oncol</source>. <volume>3</volume>:<issue>246</issue>. <pub-id pub-id-type="doi">10.3389/fonc.2013.00246</pub-id><pub-id pub-id-type="pmid">24066280</pub-id></citation>
</ref>
<ref id="B22">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hong</surname> <given-names>K.</given-names></name> <name><surname>Sherley</surname> <given-names>J.</given-names></name> <name><surname>Lauffenburger</surname> <given-names>D. A.</given-names></name></person-group> (<year>2001</year>). <article-title>Methylation of episomal plasmids as a barrier to transient gene expression via a synthetic delivery vector</article-title>. <source>Biomol. Eng</source>. <volume>18</volume>, <fpage>185</fpage>&#x02013;<lpage>192</lpage>. <pub-id pub-id-type="doi">10.1016/S1389-0344(01)00100-9</pub-id><pub-id pub-id-type="pmid">11576873</pub-id></citation>
</ref>
<ref id="B23">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jacobs</surname> <given-names>M. V.</given-names></name> <name><surname>Snijders</surname> <given-names>P. J.</given-names></name> <name><surname>van den Brule</surname> <given-names>A. J.</given-names></name> <name><surname>Helmerhorst</surname> <given-names>T. J.</given-names></name> <name><surname>Meijer</surname> <given-names>C. J.</given-names></name> <name><surname>Walboomers</surname> <given-names>J. M.</given-names></name></person-group> (<year>1997</year>). <article-title>A general primer GP5&#x0002B;/GP6(&#x0002B;)-mediated PCR-enzyme immunoassay method for rapid detection of 14 high-risk and 6 low-risk human papillomavirus genotypes in cervical scrapings</article-title>. <source>J. Clin. Microbiol</source>. <volume>35</volume>, <fpage>791</fpage>&#x02013;<lpage>795</lpage>. <pub-id pub-id-type="pmid">9041439</pub-id></citation>
</ref>
<ref id="B24">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>John</surname> <given-names>S. G.</given-names></name> <name><surname>Mendez</surname> <given-names>C. B.</given-names></name> <name><surname>Deng</surname> <given-names>L.</given-names></name> <name><surname>Poulos</surname> <given-names>B.</given-names></name> <name><surname>Kauffman</surname> <given-names>A. K. M.</given-names></name> <name><surname>Kern</surname> <given-names>S.</given-names></name> <etal/></person-group>. (<year>2011</year>). <article-title>A simple and efficient method for concentration of ocean viruses by chemical flocculation</article-title>. <source>Environ. Microbiol. Rep</source>. <volume>3</volume>, <fpage>195</fpage>&#x02013;<lpage>202</lpage>. <pub-id pub-id-type="doi">10.1111/j.1758-2229.2010.00208.x</pub-id><pub-id pub-id-type="pmid">21572525</pub-id></citation>
</ref>
<ref id="B25">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jones</surname> <given-names>M. S.</given-names></name> <name><surname>Harrach</surname> <given-names>B.</given-names></name> <name><surname>Ganac</surname> <given-names>R. D.</given-names></name> <name><surname>Gozum</surname> <given-names>M. M. A.</given-names></name> <name><surname>Dela Cruz</surname> <given-names>W. P.</given-names></name> <name><surname>Riedel</surname> <given-names>B.</given-names></name> <etal/></person-group>. (<year>2007</year>). <article-title>New adenovirus species found in a patient presenting with gastroenteritis</article-title>. <source>J. Virol</source>. <volume>81</volume>, <fpage>5978</fpage>&#x02013;<lpage>5984</lpage>. <pub-id pub-id-type="doi">10.1128/JVI.02650-06</pub-id><pub-id pub-id-type="pmid">17360747</pub-id></citation>
</ref>
<ref id="B26">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jones</surname> <given-names>P. A.</given-names></name></person-group> (<year>2012</year>). <article-title>Functions of DNA methylation: islands, start sites, gene bodies and beyond</article-title>. <source>Nat. Rev. Genet</source>. <volume>13</volume>, <fpage>484</fpage>&#x02013;<lpage>492</lpage>. <pub-id pub-id-type="doi">10.1038/nrg3230</pub-id><pub-id pub-id-type="pmid">22641018</pub-id></citation>
</ref>
<ref id="B27">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Khoury</surname> <given-names>J. D.</given-names></name> <name><surname>Tannir</surname> <given-names>N. M.</given-names></name> <name><surname>Williams</surname> <given-names>M. D.</given-names></name> <name><surname>Chen</surname> <given-names>Y.</given-names></name> <name><surname>Yao</surname> <given-names>H.</given-names></name> <name><surname>Zhang</surname> <given-names>J.</given-names></name> <etal/></person-group>. (<year>2013</year>). <article-title>Landscape of DNA virus associations across human malignant cancers: analysis of 3,775 cases using RNA-Seq</article-title>. <source>J. Virol</source>. <volume>87</volume>, <fpage>8916</fpage>&#x02013;<lpage>8926</lpage>. <pub-id pub-id-type="doi">10.1128/JVI.00340-13</pub-id><pub-id pub-id-type="pmid">23740984</pub-id></citation>
</ref>
<ref id="B28">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kim</surname> <given-names>M.</given-names></name> <name><surname>Gans</surname> <given-names>J. D.</given-names></name> <name><surname>Nogueira</surname> <given-names>C.</given-names></name> <name><surname>Wang</surname> <given-names>A.</given-names></name> <name><surname>Paik</surname> <given-names>J.-H.</given-names></name> <name><surname>Feng</surname> <given-names>B.</given-names></name> <etal/></person-group>. (<year>2006</year>). <article-title>Comparative oncogenomics identifies NEDD9 as a melanoma metastasis gene</article-title>. <source>Cell</source> <volume>125</volume>, <fpage>1269</fpage>&#x02013;<lpage>1281</lpage>. <pub-id pub-id-type="doi">10.1016/j.cell.2006.06.008</pub-id><pub-id pub-id-type="pmid">16814714</pub-id></citation>
</ref>
<ref id="B29">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kircher</surname> <given-names>M.</given-names></name> <name><surname>Sawyer</surname> <given-names>S.</given-names></name> <name><surname>Meyer</surname> <given-names>M.</given-names></name></person-group> (<year>2012</year>). <article-title>Double indexing overcomes inaccuracies in multiplex sequencing on the Illumina platform</article-title>. <source>Nucleic Acids Res</source>. <volume>40</volume>:<fpage>e3</fpage>. <pub-id pub-id-type="doi">10.1093/nar/gkr771</pub-id><pub-id pub-id-type="pmid">22021376</pub-id></citation>
</ref>
<ref id="B30">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kr&#x000FC;ger</surname> <given-names>D. H.</given-names></name> <name><surname>Bickle</surname> <given-names>T. A.</given-names></name></person-group> (<year>1983</year>). <article-title>Bacteriophage survival: multiple mechanisms for avoiding the deoxyribonucleic acid restriction systems of their hosts</article-title>. <source>Microbiol. Rev</source>. <volume>47</volume>, <fpage>345</fpage>&#x02013;<lpage>360</lpage>. <pub-id pub-id-type="pmid">6314109</pub-id></citation>
</ref>
<ref id="B31">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Langmead</surname> <given-names>B.</given-names></name> <name><surname>Trapnell</surname> <given-names>C.</given-names></name> <name><surname>Pop</surname> <given-names>M.</given-names></name> <name><surname>Salzberg</surname> <given-names>S. L.</given-names></name></person-group> (<year>2009</year>). <article-title>Ultrafast and memory-efficient alignment of short DNA sequences to the human genome</article-title>. <source>Genome Biol</source>. <volume>10</volume>:<fpage>R25</fpage>. <pub-id pub-id-type="doi">10.1186/gb-2009-10-3-r25</pub-id><pub-id pub-id-type="pmid">19261174</pub-id></citation>
</ref>
<ref id="B32">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Leinonen</surname> <given-names>R.</given-names></name> <name><surname>Akhtar</surname> <given-names>R.</given-names></name> <name><surname>Birney</surname> <given-names>E.</given-names></name> <name><surname>Bower</surname> <given-names>L.</given-names></name> <name><surname>Cerdeno-T&#x000E1;rraga</surname> <given-names>A.</given-names></name> <name><surname>Cheng</surname> <given-names>Y.</given-names></name> <etal/></person-group>. (<year>2011</year>). <article-title>The European Nucleotide Archive</article-title>. <source>Nucleic Acids Res</source>. <volume>39</volume>, <fpage>D28</fpage>&#x02013;<lpage>D31</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gkq967</pub-id><pub-id pub-id-type="pmid">20972220</pub-id></citation>
</ref>
<ref id="B33">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>J.</given-names></name> <name><surname>Huang</surname> <given-names>Q.</given-names></name> <name><surname>Zeng</surname> <given-names>F.</given-names></name> <name><surname>Li</surname> <given-names>W.</given-names></name> <name><surname>He</surname> <given-names>Z.</given-names></name> <name><surname>Chen</surname> <given-names>W.</given-names></name> <etal/></person-group>. (<year>2014</year>). <article-title>The prognostic value of global DNA hypomethylation in cancer: a meta-analysis</article-title>. <source>PLoS ONE</source> <volume>9</volume>:<fpage>e106290</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pone.0106290</pub-id><pub-id pub-id-type="pmid">25184628</pub-id></citation>
</ref>
<ref id="B34">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lysholm</surname> <given-names>F.</given-names></name> <name><surname>Wetterbom</surname> <given-names>A.</given-names></name> <name><surname>Lindau</surname> <given-names>C.</given-names></name> <name><surname>Darban</surname> <given-names>H.</given-names></name> <name><surname>Bjerkner</surname> <given-names>A.</given-names></name> <name><surname>Fahlander</surname> <given-names>K.</given-names></name> <etal/></person-group>. (<year>2012</year>). <article-title>Characterization of the viral microbiome in patients with severe lower respiratory tract infections, using metagenomic sequencing</article-title>. <source>PLoS ONE</source> <volume>7</volume>:<fpage>e30875</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pone.0030875</pub-id><pub-id pub-id-type="pmid">22355331</pub-id></citation>
</ref>
<ref id="B35">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Marongiu</surname> <given-names>L.</given-names></name> <name><surname>Godi</surname> <given-names>A.</given-names></name> <name><surname>Parry</surname> <given-names>J. V.</given-names></name> <name><surname>Beddows</surname> <given-names>S.</given-names></name></person-group> (<year>2014</year>). <article-title>Human Papillomavirus 16, 18, 31 and 45 viral load, integration and methylation status stratified by cervical disease stage</article-title>. <source>BMC Cancer</source> <volume>14</volume>:<fpage>384</fpage>. <pub-id pub-id-type="doi">10.1186/1471-2407-14-384</pub-id><pub-id pub-id-type="pmid">24885011</pub-id></citation>
</ref>
<ref id="B36">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Martin</surname> <given-names>D. H.</given-names></name> <name><surname>Zozaya</surname> <given-names>M.</given-names></name> <name><surname>Lillis</surname> <given-names>R.</given-names></name> <name><surname>Miller</surname> <given-names>J.</given-names></name> <name><surname>Ferris</surname> <given-names>M. J.</given-names></name></person-group> (<year>2012</year>). <article-title>The microbiota of the human genitourinary tract: trying to see the forest through the trees</article-title>. <source>Trans. Am. Clin. Climatol. Assoc</source>. <volume>123</volume>, <fpage>242</fpage>&#x02013;<lpage>256</lpage>. <pub-id pub-id-type="pmid">23303991</pub-id></citation>
</ref>
<ref id="B37">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Niu</surname> <given-names>B.</given-names></name> <name><surname>Zhu</surname> <given-names>Z.</given-names></name> <name><surname>Fu</surname> <given-names>L.</given-names></name> <name><surname>Wu</surname> <given-names>S.</given-names></name> <name><surname>Li</surname> <given-names>W.</given-names></name></person-group> (<year>2011</year>). <article-title>FR-HIT, a very fast program to recruit metagenomic reads to homologous reference genomes</article-title>. <source>Bioinformatics</source> <volume>27</volume>, <fpage>1704</fpage>&#x02013;<lpage>1705</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btr252</pub-id><pub-id pub-id-type="pmid">21505035</pub-id></citation>
</ref>
<ref id="B38">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Paulson</surname> <given-names>E. J.</given-names></name> <name><surname>Speck</surname> <given-names>S. H.</given-names></name></person-group> (<year>1999</year>). <article-title>Differential methylation of Epstein-Barr virus latency promoters facilitates viral persistence in healthy seropositive individuals</article-title>. <source>J. Virol</source>. <volume>73</volume>, <fpage>9959</fpage>&#x02013;<lpage>9968</lpage>. <pub-id pub-id-type="pmid">10559309</pub-id></citation>
</ref>
<ref id="B39">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Poreba</surname> <given-names>E.</given-names></name> <name><surname>Broniarczyk</surname> <given-names>J. K.</given-names></name> <name><surname>Gozdzicka-Jozefiak</surname> <given-names>A.</given-names></name></person-group> (<year>2011</year>). <article-title>Epigenetic mechanisms in virus-induced tumorigenesis</article-title>. <source>Clin. Epigenetics</source> <volume>2</volume>, <fpage>233</fpage>&#x02013;<lpage>247</lpage>. <pub-id pub-id-type="doi">10.1007/s13148-011-0026-6</pub-id><pub-id pub-id-type="pmid">22704339</pub-id></citation>
</ref>
<ref id="B40">
<citation citation-type="book"><person-group person-group-type="author"><collab>R Core Team</collab></person-group> (<year>2012</year>). <source>R: A Language and Environment for Statistical Computing</source>. <publisher-loc>Vienna</publisher-loc>: <publisher-name>R Foundation for Statistical Computing</publisher-name>.</citation>
</ref>
<ref id="B41">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Salyakina</surname> <given-names>D.</given-names></name> <name><surname>Tsinoremas</surname> <given-names>N. F.</given-names></name></person-group> (<year>2013</year>). <article-title>Viral expression associated with gastrointestinal adenocarcinomas in TCGA high-throughput sequencing data</article-title>. <source>Hum. Genomics</source> <volume>7</volume>:<fpage>23</fpage>. <pub-id pub-id-type="doi">10.1186/1479-7364-7-23</pub-id><pub-id pub-id-type="pmid">24279398</pub-id></citation>
</ref>
<ref id="B42">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schiller</surname> <given-names>J. T.</given-names></name> <name><surname>Lowy</surname> <given-names>D. R.</given-names></name></person-group> (<year>2010</year>). <article-title>Vaccines to prevent infections by oncoviruses</article-title>. <source>Annu. Rev. Microbiol</source>. <volume>64</volume>, <fpage>23</fpage>&#x02013;<lpage>41</lpage>. <pub-id pub-id-type="doi">10.1146/annurev.micro.112408.134019</pub-id><pub-id pub-id-type="pmid">20420520</pub-id></citation>
</ref>
<ref id="B43">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schmieder</surname> <given-names>R.</given-names></name> <name><surname>Edwards</surname> <given-names>R.</given-names></name></person-group> (<year>2011</year>). <article-title>Quality control and preprocessing of metagenomic datasets</article-title>. <source>Bioinformatics</source> <volume>27</volume>, <fpage>863</fpage>&#x02013;<lpage>864</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btr026</pub-id><pub-id pub-id-type="pmid">21278185</pub-id></citation>
</ref>
<ref id="B44">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Serre</surname> <given-names>D.</given-names></name> <name><surname>Lee</surname> <given-names>B. H.</given-names></name> <name><surname>Ting</surname> <given-names>A. H.</given-names></name></person-group> (<year>2010</year>). <article-title>MBD-isolated genome sequencing provides a high-throughput and comprehensive survey of DNA methylation in the human genome</article-title>. <source>Nucleic Acids Res</source>. <volume>38</volume>, <fpage>391</fpage>&#x02013;<lpage>399</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gkp992</pub-id><pub-id pub-id-type="pmid">19906696</pub-id></citation>
</ref>
<ref id="B45">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Smiraglia</surname> <given-names>D. J.</given-names></name></person-group> (<year>2001</year>). <article-title>Excessive CpG island hypermethylation in cancer cell lines versus primary human malignancies</article-title>. <source>Hum. Mol. Genet</source>. <volume>10</volume>, <fpage>1413</fpage>&#x02013;<lpage>1419</lpage>. <pub-id pub-id-type="doi">10.1093/hmg/10.13.1413</pub-id><pub-id pub-id-type="pmid">11440994</pub-id></citation>
</ref>
<ref id="B46">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Steenbergen</surname> <given-names>R. D.</given-names></name> <name><surname>Walboomers</surname> <given-names>J. M.</given-names></name> <name><surname>Meijer</surname> <given-names>C. J.</given-names></name> <name><surname>van der Raaij-Helmer</surname> <given-names>E. M.</given-names></name> <name><surname>Parker</surname> <given-names>J. N.</given-names></name> <name><surname>Chow</surname> <given-names>L. T.</given-names></name> <etal/></person-group>. (<year>1996</year>). <article-title>Transition of human papillomavirus type 16 and 18 transfected human foreskin keratinocytes towards immortality: activation of telomerase and allele losses at 3p, 10p, 11q and/or 18q</article-title>. <source>Oncogene</source> <volume>13</volume>, <fpage>1249</fpage>&#x02013;<lpage>1257</lpage>. <pub-id pub-id-type="pmid">8808699</pub-id></citation>
</ref>
<ref id="B47">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Steenbergen</surname> <given-names>R. D. M.</given-names></name> <name><surname>Ongenaert</surname> <given-names>M.</given-names></name> <name><surname>Snellenberg</surname> <given-names>S.</given-names></name> <name><surname>Trooskens</surname> <given-names>G.</given-names></name> <name><surname>van der Meide</surname> <given-names>W. F.</given-names></name> <name><surname>Pandey</surname> <given-names>D.</given-names></name> <etal/></person-group>. (<year>2013</year>). <article-title>Methylation-specific digital karyotyping of HPV16E6E7-expressing human keratinocytes identifies novel methylation events in cervical carcinogenesis</article-title>. <source>J. Pathol</source>. <volume>231</volume>, <fpage>53</fpage>&#x02013;<lpage>62</lpage>. <pub-id pub-id-type="doi">10.1002/path.4210</pub-id><pub-id pub-id-type="pmid">23674368</pub-id></citation>
</ref>
<ref id="B48">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Steenbergen</surname> <given-names>R. D. M.</given-names></name> <name><surname>Snijders</surname> <given-names>P. J. F.</given-names></name> <name><surname>Heideman</surname> <given-names>D. A. M.</given-names></name> <name><surname>Meijer</surname> <given-names>C. J. L. M.</given-names></name></person-group> (<year>2014</year>). <article-title>Clinical implications of (epi)genetic changes in HPV-induced cervical precancerous lesions</article-title>. <source>Nat. Rev. Cancer</source> <volume>14</volume>, <fpage>395</fpage>&#x02013;<lpage>405</lpage>. <pub-id pub-id-type="doi">10.1038/nrc3728</pub-id><pub-id pub-id-type="pmid">24854082</pub-id></citation>
</ref>
<ref id="B49">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Szostek</surname> <given-names>S.</given-names></name> <name><surname>Zawilinska</surname> <given-names>B.</given-names></name> <name><surname>Kopec</surname> <given-names>J.</given-names></name> <name><surname>Kosz-Vnenchak</surname> <given-names>M.</given-names></name></person-group> (<year>2009</year>). <article-title>Herpesviruses as possible cofactors in HPV-16-related oncogenesis</article-title>. <source>Acta Biochim. Pol</source>. <volume>56</volume>, <fpage>337</fpage>&#x02013;<lpage>342</lpage>. <pub-id pub-id-type="pmid">19499088</pub-id></citation>
</ref>
<ref id="B50">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tjon Pian Gi</surname> <given-names>R. E. A.</given-names></name> <name><surname>San Giorgi</surname> <given-names>M. R. M.</given-names></name> <name><surname>Slagter-Menkema</surname> <given-names>L.</given-names></name> <name><surname>van Hemel</surname> <given-names>B. M.</given-names></name> <name><surname>van der Laan</surname> <given-names>B. F. A. M.</given-names></name> <name><surname>van den Heuvel</surname> <given-names>E. R.</given-names></name> <etal/></person-group>. (<year>2014</year>). <article-title>The clinical course of recurrent respiratory papillomatosis: a comparison between aggressiveness of HPV6 and HPV11</article-title>. <source>Head Neck</source>. [Epub ahead of print]. <pub-id pub-id-type="doi">10.1002/hed.23808</pub-id><pub-id pub-id-type="pmid">24955561</pub-id></citation>
</ref>
<ref id="B51">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>van den Brule</surname> <given-names>A. J. C.</given-names></name> <name><surname>Pol</surname> <given-names>R.</given-names></name> <name><surname>Fransen-Daalmeijer</surname> <given-names>N.</given-names></name> <name><surname>Schouls</surname> <given-names>L. M.</given-names></name> <name><surname>Meijer</surname> <given-names>C. J. L. M.</given-names></name> <name><surname>Snijders</surname> <given-names>P. J. F.</given-names></name></person-group> (<year>2002</year>). <article-title>GP5&#x0002B;/6&#x0002B; PCR followed by reverse line blot analysis enables rapid and high-throughput identification of human papillomavirus genotypes</article-title>. <source>J. Clin. Microbiol</source>. <volume>40</volume>, <fpage>779</fpage>&#x02013;<lpage>787</lpage>. <pub-id pub-id-type="doi">10.1128/JCM.40.3.779-787.2002</pub-id><pub-id pub-id-type="pmid">11880393</pub-id></citation>
</ref>
<ref id="B52">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Varley</surname> <given-names>K. E.</given-names></name> <name><surname>Gertz</surname> <given-names>J.</given-names></name> <name><surname>Bowling</surname> <given-names>K. M.</given-names></name> <name><surname>Parker</surname> <given-names>S. L.</given-names></name> <name><surname>Reddy</surname> <given-names>T. E.</given-names></name> <name><surname>Pauli-Behn</surname> <given-names>F.</given-names></name> <etal/></person-group>. (<year>2013</year>). <article-title>Dynamic DNA methylation across diverse human cell lines and tissues</article-title>. <source>Genome Res</source>. <volume>23</volume>, <fpage>555</fpage>&#x02013;<lpage>567</lpage>. <pub-id pub-id-type="doi">10.1101/gr.147942.112</pub-id><pub-id pub-id-type="pmid">23325432</pub-id></citation>
</ref>
<ref id="B53">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Walboomers</surname> <given-names>J. M.</given-names></name> <name><surname>Jacobs</surname> <given-names>M. V.</given-names></name> <name><surname>Manos</surname> <given-names>M. M.</given-names></name> <name><surname>Bosch</surname> <given-names>F. X.</given-names></name> <name><surname>Kummer</surname> <given-names>J. A.</given-names></name> <name><surname>Shah</surname> <given-names>K. V.</given-names></name> <etal/></person-group>. (<year>1999</year>). <article-title>Human papillomavirus is a necessary cause of invasive cervical cancer worldwide</article-title>. <source>J. Pathol</source>. <volume>189</volume>, <fpage>12</fpage>&#x02013;<lpage>19</lpage>. <pub-id pub-id-type="doi">10.1002/(SICI)1096-9896(199909)189:1&#x0003C;12::AID-PATH431&#x0003E;3.0.CO;2-F</pub-id><pub-id pub-id-type="pmid">10451482</pub-id></citation>
</ref>
<ref id="B54">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wheeler</surname> <given-names>D. L.</given-names></name> <name><surname>Barrett</surname> <given-names>T.</given-names></name> <name><surname>Benson</surname> <given-names>D. A.</given-names></name> <name><surname>Bryant</surname> <given-names>S. H.</given-names></name> <name><surname>Canese</surname> <given-names>K.</given-names></name> <name><surname>Chetvernin</surname> <given-names>V.</given-names></name> <etal/></person-group>. (<year>2006</year>). <article-title>Database resources of the National Center for Biotechnology Information</article-title>. <source>Nucleic Acids Res</source>. <volume>34</volume>, <fpage>D173</fpage>&#x02013;<lpage>D180</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gkj158</pub-id><pub-id pub-id-type="pmid">16381840</pub-id></citation>
</ref>
<ref id="B55">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Wickham</surname> <given-names>H.</given-names></name></person-group> (<year>2009</year>). <source>ggplot2: Elegant Graphics for Data Analysis</source>. <publisher-loc>New York, NY</publisher-loc>: <publisher-name>Springer</publisher-name>.</citation>
</ref>
<ref id="B56">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Willner</surname> <given-names>D.</given-names></name> <name><surname>Hugenholtz</surname> <given-names>P.</given-names></name></person-group> (<year>2013</year>). <article-title>From deep sequencing to viral tagging: recent advances in viral metagenomics</article-title>. <source>Bioessays</source> <volume>35</volume>, <fpage>436</fpage>&#x02013;<lpage>442</lpage>. <pub-id pub-id-type="doi">10.1002/bies.201200174</pub-id><pub-id pub-id-type="pmid">23450659</pub-id></citation>
</ref>
<ref id="B57">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wisman</surname> <given-names>G. B. A.</given-names></name> <name><surname>Nijhuis</surname> <given-names>E. R.</given-names></name> <name><surname>Hoque</surname> <given-names>M. O.</given-names></name> <name><surname>Reesink-Peters</surname> <given-names>N.</given-names></name> <name><surname>Koning</surname> <given-names>A. J.</given-names></name> <name><surname>Volders</surname> <given-names>H. H.</given-names></name> <etal/></person-group>. (<year>2006</year>). <article-title>Assessment of gene promoter hypermethylation for detection of cervical neoplasia</article-title>. <source>Int. J. Cancer</source> <volume>119</volume>, <fpage>1908</fpage>&#x02013;<lpage>1914</lpage>. <pub-id pub-id-type="doi">10.1002/ijc.22060</pub-id><pub-id pub-id-type="pmid">16736496</pub-id></citation>
</ref>
<ref id="B58">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yozwiak</surname> <given-names>N. L.</given-names></name> <name><surname>Skewes-Cox</surname> <given-names>P.</given-names></name> <name><surname>Stenglein</surname> <given-names>M. D.</given-names></name> <name><surname>Balmaseda</surname> <given-names>A.</given-names></name> <name><surname>Harris</surname> <given-names>E.</given-names></name> <name><surname>DeRisi</surname> <given-names>J. L.</given-names></name></person-group> (<year>2012</year>). <article-title>Virus identification in unknown tropical febrile illness cases using deep sequencing</article-title>. <source>PLoS Negl. Trop. Dis</source>. <volume>6</volume>:<fpage>e1485</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pntd.0001485</pub-id><pub-id pub-id-type="pmid">22347512</pub-id></citation>
</ref>
<ref id="B59">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>zur Hausen</surname> <given-names>H.</given-names></name></person-group> (<year>2006</year>). <source>Infections Causing Human Cancer</source>. <publisher-loc>Weinheim</publisher-loc>: <publisher-name>Wiley-VCH</publisher-name>.</citation>
</ref>
</ref-list>
</back>
</article>