METHODS article

Front. Microbiol., 28 September 2018

Sec. Virology

Volume 9 - 2018 | https://doi.org/10.3389/fmicb.2018.02339

Whole Genome Sequencing of Enteroviruses Species A to D by High-Throughput Sequencing: Application for Viral Mixtures

  • 1. Unité de Biologie des Virus Entériques, Institut Pasteur, Paris, France

  • 2. Institut National de la Santé et de la Recherche Médicale, Paris, France

  • 3. WHO Collaborating Center for Research on Enteroviruses and Viral Vaccines, Institut Pasteur, Paris, France

  • 4. Unité de Virologie, Institut Pasteur de Madagascar, Antananarivo, Madagascar

Abstract

Human enteroviruses (EV) consist of more than 100 serotypes classified within four species for enteroviruses (EV-A to -D) and three species for rhinoviruses, which have been implicated in a variety of human illnesses. Being able to simultaneously amplify the whole genome and identify enteroviruses in samples is important for studying the viral diversity in different geographical regions and populations. It also provides knowledge about the evolution of these viruses. Therefore, we developed a rapid, sensitive method to detect and genetically classify all human enteroviruses in mixtures. Strains of EV-A (15), EV-B (40), EV-C (20), and EV-D (2) viruses were used in addition to 20 supernatants from RD cells infected with stool extracts or sewage concentrates. Two overlapping fragments were produced using a newly designed degenerate primer targeting the conserved CRE region for enteroviruses A-D and one degenerate primer set designed to specifically target the conserved region for each enterovirus species (EV-A to -D). This method was capable of sequencing the full genome for all viruses except two, for which nearly 90% of the genome was sequenced. This method also demonstrated the ability to discriminate, in both spiked and unspiked mixtures, the different enterovirus types present.

Introduction

Human enteroviruses (EV) are common human viruses comprising more than 100 serotypes, most of which are classified within four enterovirus species (EV-A to -D) and three rhinovirus species. Enteroviruses are ubiquitous, resilient in the environment, and primarily transmitted by the fecal-oral route. They have been implicated in a variety of human diseases, including the common cold, hand, foot, and mouth disease, acute hemorrhagic conjunctivitis, myocarditis, encephalitis, and poliomyelitis.

Enteroviruses are non-enveloped viruses with a positive-sense, single-stranded RNA genome approximately 7500 nucleotides (nt) in length. Two untranslated regions (5′- and 3′-UTR) flank a large open reading frame encoding a polyprotein that is cleaved to give three precursors, P1 to P3. These precursors are subsequently cleaved to give functional proteins: (1) P1 gives rise to the four structural capsid proteins (VP1–VP4) and (2) P2 and P3 give rise to the non-structural proteins involved in the virus life cycle.

Studies have been conducted to determine the most accurate way to identify and classify enteroviruses, giving rise to some “gold standards” for detection and typing. Typically, enteroviruses have been isolated on a variety of cell lines (e.g., RD, GMK, Vero, CaCo-2, L20B, HEp2c, and HeLa) based on their ability to propagate and show cytopathic effect (CPE), followed by serotyping using neutralization assays (Silva et al., 2008; Poh and Tan, 2010; Blomqvist and Roivainen, 2016). Although amplification in cell culture is an appropriate method, it is laborious, time-consuming, and expensive (Nijhuis et al., 2002). It has been shown that there is a good correlation between VP1 nucleotide and amino acid sequences and enterovirus serotypes (Oberste et al., 1999a; Caro et al., 2001). Thus, VP1 sequences have been used as a gold standard for typing and, if relevant, subtyping enteroviruses (Oberste et al., 1999a,b; Nix et al., 2006; Poh and Tan, 2010).

In addition to typing, studies dedicated to evaluating the enterovirus genomic diversity have shown that besides mutations, intra- and inter-typic genetic recombination is also a frequent mechanism of viral evolution (Lukashev et al., 2003; Simmonds, 2006; Combelas et al., 2011; Joffret et al., 2012). Enterovirus genomes frequently display mosaicism due to genetic exchanges among different enterovirus strains and types. Since this kind of genetic diversification can be at the source of genotypic and phenotypic diversity (Sadeuh-Mba et al., 2013; Bessaud et al., 2016) it is essential to determine the whole genomic sequences of enteroviruses for surveillance and public health purposes, as well as for basic research.

For obtaining sequencing data, the traditional Sanger method is capable of sequencing the whole genome, but it is time-consuming and cannot simultaneously sequence a mixture of viruses, which makes large-scale surveillance projects difficult to conduct when many viruses are present. Furthermore, with this approach it is challenging to specifically target every known enterovirus and impossible to identify unknown viruses.

Next generation sequencing (NGS) methods offer a new powerful sequencing tool for the identification and characterization of enteroviruses. This has been successfully used to sequence partial or whole genome sequences of poliovirus and of enteroviruses species C (Bessaud et al., 2016; Montmayeur et al., 2017; Sahoo et al., 2017). Additionally, a generic assay for whole genome amplification and deep sequencing of enterovirus A71 was published (Tan et al., 2015). Despite advances in molecular methods, none of these assays were designed to simultaneously amplify the whole genome for all four enterovirus species, and it is known that mixtures of enteroviruses can be found in human stool or sewage. Hence, the goal of this research was to design generic and species-specific primers in order to develop an assay capable of viral typing and sequencing the whole genome for all enteroviruses of species A to D present in samples containing viral mixtures.

Materials and Methods

Viruses

For this study, 15 EV-A, 40 EV-B, 20 EV-C, and 2 EV-D viruses were used (Table 1) to test and analyze whole-genome sequencing. The prototype strains selected were from the European Virus Archive (EVAg)1. Additionally, we used 20 enterovirus-positive supernatants from human rhabdomyosarcoma cells (RD cells) from the Madagascar National Polio Laboratory. These cells had been infected with either human stool extracts (10 samples) or sewage concentrates (10 samples) collected during poliomyelitis surveillance in Madagascar and prepared according to WHO protocols (World Health Organization Polio Laboratory manual, 4th edition, 2004). Infection of RD cells with these extracts was followed by a full cytopathic effect resembling that induced by enteroviruses.

Table 1

Type | Strain or isolate | Databank accession no. | Consensus length (nt)
SPECIES A
CV-A2 | Fleetwood | AY421760 | 7411
CV-A3 | Olson | AY421761 | 7411
CV-A6 | Gdula | AY421764 | 7551
CV-A6 | MAD-2628-11 | LT719047 | 7503
CV-A7 | Parker | AY421765 | 7395
CV-A10 | Kowalik | AY421767 | 7420
CV-A10 | MAD-9856-11 | LT719059 | 7402
CV-A10 | MAD-3995-11 | LT719056 | 7415
CV-A12 | Texas-12 | AY421768 | 7623
CV-A14 | G14 | AY421769 | 7540
CV-A14 | MAD-2718-11 | LT719062 | 7521
CV-A16 | G10 | U05876 | 7000
EV-A71 | MAD-3126-11 | LT719063 | 7400
EV-A71 | MAD-72341-04 | LT719065 | 7326
EV-A71 | CAE-146-08 | LT719066 | 7511
SPECIES B
CV-A9 | RO-609-4-80 | LS451285 | 7398
CV-B1 | RO-98-1-74 | LS451286 | 7410
CV-B2 | Ohio | AF081485 | 7350
CV-B3 | RO-123-1-95 | LS451287 | 7519
CV-B3 | RO-69-1-89 | LS451288 | 7327
CV-B4 | E2 | AF311939 | 7173
CV-B4 | RO-69-1-86 | LS451289 | 7351
CV-B5 | RO-14-5-70 | LS451290 | 7273
CV-B5 | Faulkner | AF114383 | 7404
CV-B6 | Schmitt | AF039205 | 7209
CV-B6 | RO-86-1-73 | LS451291 | 7345
E-1 | Farouk | AF029859 | 7313
E-1 | RO-122-1-74 | LS451292 | 7351
E-3 | Morrissey | AY302553 | 7366
E-4 | Pesacek | AY302557 | 7359
E-5 | Noyce | AF083069 | 7340
E-5 | RO-79-2-71 | LS451293 | 7444
E-6 | D’Amori | AY302558 | 7436
E-6 | RO-24-9-79 | LS451294 | 7421
E-7 | RO-434-2-81 | LS451295 | 7355
E-7 | RO-141-2-95 | LS451296 | 7386
E-9 | Hill | X84981 | 7379
E-9 | RO-116-6-82 | LS451297 | 7369
E-11 | Gregory | X80059 | 7564
E-11 | RO-91-91 | AJ577594 | 7460
E-12 | RO-78-3-74 | LS451298 | 7351
E-13 | DelCarmen | AY302539 | 7410
E-14 | Tow | AY302540 | 7352
E-14 | RO-81-1-79 | LS451299 | 7457
E-15 | CH-96-51 | AY302541 | 7364
E-20 | JV-1 | AY302546 | 7321
E-21 | Farina | AY302547 | 7000
E-24 | DeCamp | AY302548 | 7239
E-25 | JV-4 | AY302549 | 7443
E-26 | Coronel | AY302550 | 7459
E-27 | Bacon | AY302551 | 7339
E-29 | JV-10 | AY302552 | 7362
E-30 | Bastianni | AY311938 | 7440
E-32 | PR-10 | AY302555 | 7448
EV-B69 | Toluca-1 | AY302560 | 7338
SPECIES C
CV-A1 | Tompkins | AF499635 | 7351
CV-A11 | Belgium | AF499636 | 7326
CV-A11 | MAD-66122 | JF260917 | 7296
CV-A11 | MAD-66990 | JF260918 | 7000
CV-A13 | Flores | AF465511 | 7368
CV-A13 | G13 | AF499640 | 7380
CV-A13 | MAD-67001 | JF260920 | 7298
CV-A13 | MAD-67900 | JF260921 | 7000
CV-A17 | G12 | AF499639 | 7301
CV-A17 | MAD-67610 | JF260924 | 7368
CV-A17 | MAD-68154 | JF260925 | 7541
CV-A19 | 8663 | AF499641 | 7449
CV-A20 | IH35 | AF499642 | 7415
CV-A20a | Tulane | X87601 | 7305
CV-A20b | Cecil | X87602 | 7360
CV-A21 | Coe | D00538 | 7393
CV-A24 | DOU-054 | JX417873 | 7366
EV-C95 | T08-083 | JX417822 | 6895
EV-C99 | MAD-69412-03 | LS451300 | 7438
EV-C99 | MAD-69558-03 | LS451301 | 7383
SPECIES D
EV-D68 | Fermon | AY426531 | 7369
EV-D70 | J670/71 | D00820 | 6467

Viruses used in this study.

For the collection of stool from Madagascar, the protocol was approved by the National Ethics Committee of the Ministry of Public Health in Madagascar (Agreement N° 22-MSANP/CE). Parental written informed consent was given.

Primer Design

For each enterovirus group, complete genome sequences were separately aligned using CLC Main Workbench. To obtain the two overlapping fragments, the first half of the genome was amplified using the previously described C004 primer (Bessaud et al., 2016) in combination with a newly designed degenerate primer targeting the conserved CRE (cis-acting replication element) region of enteroviruses A-D. The second half of the genome was amplified using one degenerate primer set per species, designed to specifically target conserved regions in each enterovirus species. We therefore performed five PCR mixes for each isolate (one for the first half of the genome and four species-specific mixes for the second half) and pooled the PCR products for sequencing. All primers used in this study are described in Table 2, and Figure 1 illustrates the primer locations.

Table 2

Primer set | Enterovirus species | Primer name | 5′–3′ Sequence | Genome position¹
1 | A, B, C, and D | C004² | TTAAAACAGCYYKDGGGTTG | 1–20
1 | A, B, C, and D | EV-CRE-R | CGGBRTTTGSWCTTGAACTG | ∼4500
2 | A | EVA-4110-F | AARAARTTYAAYGAYATGGC | 4110–4130
2 | A | EVA-7410-R | TTTGCTATTCTGGTTATAAC | 7410–7390
2 | B | EVB-4110-F | GGCGNTGGCTYAARCRAARG | 4110–4130
2 | B | EVB-7400-R | GCACCGAATGCGGAGAATTTAC | 7400–7378
2 | C | EVC-4220-F | GARGCNTGYAAYGCNGCNAARG | 4220–4242
2 | C | C005² | CCGAATYAAARRAAAATTTACCC | 7437–7415
2 | D | EVD-4112-F | GGCTAKCMCAAAAGATWGAC | 4112–4132
2 | D | EVD-7362-R | CCAAKTRACCAAAATTTACC | 7362–7342

Primers used for this study.

¹The specific genome position. Position may vary depending on the strain of the virus. ²Primer described in Bessaud et al. (2016).
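As an illustration of what "degenerate" means for the primers in Table 2, the following minimal sketch counts and enumerates the concrete oligonucleotides that a degenerate sequence stands for, using the standard IUPAC nucleotide ambiguity codes. The function names are hypothetical and this is not part of the authors' pipeline; only the primer sequences are taken from Table 2.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer:
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """List every concrete sequence represented by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]


# Primer sequences copied from Table 2.
C004 = "TTAAAACAGCYYKDGGGTTG"
EV_CRE_R = "CGGBRTTTGSWCTTGAACTG"
EVA_7410_R = "TTTGCTATTCTGGTTATAAC"
EVC_4220_F = "GARGCNTGYAAYGCNGCNAARG"

if __name__ == "__main__":
    for name, seq in [("C004", C004), ("EV-CRE-R", EV_CRE_R),
                      ("EVA-7410-R", EVA_7410_R), ("EVC-4220-F", EVC_4220_F)]:
        print(name, len(seq), "nt, degeneracy", degeneracy(seq))
```

A non-degenerate primer such as EVA-7410-R has a degeneracy of 1, whereas primers with several N positions represent many more sequences; full expansion with `expand` is only practical for the less degenerate primers.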

FIGURE 1

Location of primers used in this study. The diagram shows the organization of the enterovirus genome. Arrows indicate the sites targeted by the proposed primers. The RT-PCR products are color-coded to represent the corresponding primer pair. The first half of the genome (fragment 1) is amplified using one primer pair, whereas the second half (fragment 2) is amplified using primers targeting each enterovirus species.

RNA Extraction

Viral RNA was extracted from supernatants of infected cells, using the High Pure Viral RNA kit (Roche Diagnostics, Meylan, France) per manufacturer’s protocol. All RNA was either immediately used for PCR amplification or stored at -80°C for further analysis.

One-Step RT-PCR Amplification

For full-length genome amplification, we used two primer sets (Table 2) per enterovirus group (A, B, C, and D). This allowed us to synthesize the two overlapping amplicons using the One-Step RT-PCR kit (ref G174, Applied Biological Materials Inc.). The reaction mixture contained 1.5 μl of purified RNA, 12.5 μl of 2× One-step RT-PCR buffer, 1 μl each of forward and reverse primer (20 μM), 0.5 μl of EasyScript RTase (200 U/μl), 1 μl of Bestaq DNA Polymerase (5 U/μl), and 9 μl of DNase-free water. PCR amplification was performed in a thermocycler with the following protocol: 42°C for 30 min; 94°C for 3 min; 35 cycles of 94°C for 30 s, 55°C for 30 s, and 72°C for 4 min 30 s; a final step at 72°C for 10 min; and 2 min at 4°C. All PCR products were visualized on ethidium bromide-stained agarose gels to confirm the expected product sizes.
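The arithmetic behind this reaction setup can be sketched as follows. The component volumes and cycling times are taken from the paragraph above; the master-mix helper, its 10% overage default, and the choice to leave out only the RNA template are illustrative assumptions rather than part of the published protocol. The cycling time ignores thermocycler ramp times.

```python
# Per-reaction volumes in microliters, as listed in the protocol text.
REACTION_UL = {
    "purified RNA": 1.5,
    "2x One-step RT-PCR buffer": 12.5,
    "forward primer (20 uM)": 1.0,
    "reverse primer (20 uM)": 1.0,
    "EasyScript RTase (200 U/ul)": 0.5,
    "Bestaq DNA Polymerase (5 U/ul)": 1.0,
    "DNase-free water": 9.0,
}


def reaction_volume(recipe=REACTION_UL):
    """Total volume of one reaction in microliters."""
    return sum(recipe.values())


def master_mix(n_reactions, overage=0.10, recipe=REACTION_UL):
    """Scale every component except the RNA template for n reactions.

    `overage` is a hypothetical pipetting allowance (10% by default).
    """
    scale = n_reactions * (1.0 + overage)
    return {name: vol * scale
            for name, vol in recipe.items() if name != "purified RNA"}


def cycling_minutes(cycles=35):
    """Programmed run time in minutes, excluding temperature ramping."""
    reverse_transcription = 30.0      # 42 C for 30 min
    initial_denaturation = 3.0        # 94 C for 3 min
    per_cycle = 0.5 + 0.5 + 4.5       # 30 s + 30 s + 4 min 30 s
    final_extension = 10.0            # 72 C for 10 min
    hold = 2.0                        # 4 C for 2 min
    return (reverse_transcription + initial_denaturation
            + cycles * per_cycle + final_extension + hold)


if __name__ == "__main__":
    print("Volume per reaction (ul):", reaction_volume())
    print("Programmed time (min):", cycling_minutes())
    for name, vol in master_mix(5).items():
        print(f"{name}: {vol:.2f} ul")
```

Because each isolate requires five PCR mixes with different primer pairs, a shared master mix in practice would probably also leave out the primers; the dictionary can be filtered accordingly.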

The sensitivity of this assay was evaluated using serial fourfold dilutions of four viruses (EV-A71 strain MAD-72341, CV-B4 strain E2, CV-A13 strain MAD-67001, and EV-D68 strain Fermon), one representing each species (Table 3). To maintain similar amounts of cellular nucleic acids throughout the dilutions, viral stocks were diluted in the supernatant of confluent non-infected HEp-2c cell monolayers that had been frozen and thawed twice and clarified by centrifugation.
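The expected titers along such a fourfold dilution series follow directly from the stock titer. The sketch below is a simple illustration with a hypothetical helper name; the starting value of 1.0 × 10^8 is used only as an example stock titer.

```python
def fourfold_series(stock_titer, steps=7, factor=4):
    """Expected titers for dilution factors factor**0 .. factor**(steps - 1)."""
    return [stock_titer / factor ** i for i in range(steps)]


if __name__ == "__main__":
    # Example: a stock at 1.0e8 infectious units per ml, diluted 4^0 to 4^6.
    for i, titer in enumerate(fourfold_series(1.0e8)):
        print(f"4^{i}: {titer:.1e}")
```

Seven steps (4^0 to 4^6) span a 4096-fold range, that is, roughly 3.6 log10 units.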

Table 3

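Species (reference strain)¹ | Dilution factor | Viral titer (TCID50/ml) | rRT-PCR cycle threshold² | DNA concentration (ng/μL)³ | Number of reads⁴ | Number of reads mapping against the reference (%)⁵ | Contig length (nt)⁶
EV-A (EV-A71 MAD-72341-04) | 4^0 | 6.3 × 10^6 | 18.4 | 50.6 | 437922 | 434360 (99.2%) | 7336
EV-A (EV-A71 MAD-72341-04) | 4^1 | 1.6 × 10^6 | 21.5 | 31.8 | 483490 | 471138 (97.5%) | 7410
EV-A (EV-A71 MAD-72341-04) | 4^2 | 3.9 × 10^5 | 23.1 | 20.1 | 640522 | 565731 (88.3%) | 7470
EV-A (EV-A71 MAD-72341-04) | 4^3 | 9.8 × 10^4 | 30.2 | 11.7 | 478940 | 357168 (74.6%) | 7405
EV-A (EV-A71 MAD-72341-04) | 4^4 | 2.5 × 10^4 | 31.1 | 10.2 | 270790 | 109731 (40.5%) | 7405
EV-A (EV-A71 MAD-72341-04) | 4^5 | 6.2 × 10^3 | 29.6 | 9.5 | 500760 | 85344 (17.0%) | 7405
EV-A (EV-A71 MAD-72341-04) | 4^6 | 1.5 × 10^3 | 32.6 | 10.6 | 349396 | 24072 (6.9%) | 7408
EV-B (CV-B4 E2) | 4^0 | 4.0 × 10^7 | 21.3 | 58.1 | 526474 | 457529 (86.9%) | 7442
EV-B (CV-B4 E2) | 4^1 | 1.0 × 10^7 | 25.3 | 36.5 | 529664 | 475725 (89.8%) | 7430
EV-B (CV-B4 E2) | 4^2 | 2.5 × 10^6 | 28.3 | 22.7 | 657448 | 539309 (82.0%) | 7450
EV-B (CV-B4 E2) | 4^3 | 6.3 × 10^5 | 29.3 | 15.7 | 365820 | 222881 (60.9%) | 7410
EV-B (CV-B4 E2) | 4^4 | 1.6 × 10^5 | 33.5 | 15.1 | 400411 | 127698 (31.9%) | 7434
EV-B (CV-B4 E2) | 4^5 | 3.9 × 10^4 | Undetermined⁷ | 17.6 | 517449 | 9656 (2.0%) | 4294
EV-B (CV-B4 E2) | 4^6 | 9.8 × 10^3 | Undetermined⁷ | 10.4 | 338670 | 2375 (0.7%) | 6872
EV-C (CV-A13 MAD 67001) | 4^0 | 1.0 × 10^8 | 18.6 | 29.3 | 538236 | 50299 (93.5%) | 7450
EV-C (CV-A13 MAD 67001) | 4^1 | 2.5 × 10^7 | 22.3 | 12.3 | 551962 | 462287 (83.8%) | 7374
EV-C (CV-A13 MAD 67001) | 4^2 | 6.3 × 10^6 | 23.7 | 10.4 | 369904 | 255970 (69.2%) | 7381
EV-C (CV-A13 MAD 67001) | 4^3 | 1.6 × 10^6 | 26.5 | 9.4 | 418630 | 60056 (14.4%) | 7442
EV-C (CV-A13 MAD 67001) | 4^4 | 3.9 × 10^5 | 28.7 | 10.8 | 383840 | 15414 (4.0%) | 3279
EV-C (CV-A13 MAD 67001) | 4^5 | 9.8 × 10^4 | 31.2 | 10.8 | 454272 | 10492 (2.5%) | 3371
EV-C (CV-A13 MAD 67001) | 4^6 | 2.4 × 10^4 | 33.6 | 8.8 | 467808 | 982 (0.2%) | 3262
EV-D (EV-68 Fermon) | 4^0 | 1.3 × 10^6 | 24.3 | 23.8 | 468514 | 465107 (99.3%) | 7367
EV-D (EV-68 Fermon) | 4^1 | 3.3 × 10^5 | 28.2 | 14.1 | 585646 | 494846 (84.5%) | 7359
EV-D (EV-68 Fermon) | 4^2 | 8.1 × 10^4 | 32.0 | 14.2 | 496292 | 408854 (82.4%) | 7282
EV-D (EV-68 Fermon) | 4^3 | 2.0 × 10^4 | 35.6 | 8.3 | 338906 | 178190 (52.6%) | 7277
EV-D (EV-68 Fermon) | 4^4 | 5.1 × 10^3 | Undetermined⁷ | 11.7 | 547836 | 79936 (14.6%) | 7359
EV-D (EV-68 Fermon) | 4^5 | 1.3 × 10^3 | Undetermined⁷ | 7.3 | 555258 | 67439 (12.2%) | 6670
EV-D (EV-68 Fermon) | 4^6 | 3.2 × 10^2 | Undetermined⁷ | 15.1 | 580342 | 11254 (1.9%) | 3398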

Sensitivity of the NGS assay.

¹The enterovirus species and strain of the viruses used in this assay. ²Cycle threshold values obtained using a pan-enterovirus real-time RT-PCR assay performed on the extracted RNA. ³Concentration of DNA amplicons obtained by RT-PCR for each dilution. ⁴Total number of reads from NGS sequencing. ⁵Number of reads mapped against the known reference (indicated in column one) for each viral dilution, with the percentage of total reads in parentheses. ⁶The size of each contig (enterovirus genome ∼7500 nt). ⁷There was no amplification; therefore, no threshold was obtained.

This RT-PCR amplification method was developed to analyze viruses at the amounts present after amplification in infected cells. It could probably be applied to the direct analysis of human or environmental samples, including different compartment-specific human fluids, provided that its sensitivity is adapted to the amount of virus present in these samples.

Virus Mixture Detection

To confirm the ability to detect viruses in a mixture, different samples containing known viral isolates under the following conditions were prepared: (1) equal amounts of four viral isolates representing enterovirus species A, B, C, and D and (2) four viral isolates belonging to enterovirus species B. The viral isolates used to perform these mixture experiments are listed in Table 4.

Table 4

Mixture¹ | Contig name² | Contig length (nt)³ | Annotation (virus)⁴ | Percent identity⁵ | VP1⁶ | VP1 length (nt)⁷
1 | contig 3 | 6620 | Coxsackievirus A10, strain Kowalik | 99.9 | full | 894
1 | contig 1 | 7435 | Echovirus 11, isolate ROU-9191 | 99.9 | full | 876
1 | contig 8 | 4681 | Coxsackievirus A13, isolate 67001 | 99.8 | partial | 343
1 | contig 2 | 7318 | Enterovirus 68, strain Fermon | 99.9 | partial | 536
2 | contig 1 | 4444 | Coxsackievirus B6, strain Schmitt | 99.7 | full | 846
2 | contig 2 | 4416 | Echovirus 5, strain Noyce | 99.9 | full | 876
2 | contig 3 | 4408 | Echovirus 7, strain Wallace | 99.9 | full | 876
2 | contig 4 | 4384 | Coxsackievirus B3, strain Nancy | 99.9 | full | 852
3 | contig 1 | 7416 | Echovirus 5, strain Noyce | 99.9 | full | 876
3 | contig 2 | 7404 | Echovirus 7, strain Wallace | 99.9 | full | 876
3 | contig 3 | 7383 | Coxsackievirus B6, strain Schmitt | 99.9 | full | 846
3 | contig 4 | 7376 | Coxsackievirus B3, strain Nancy | 99.9 | full | 852

NGS analysis in three mixtures containing four viruses.

¹Mixture 1 was prepared using four viruses (one strain from each species) with equal quantities of RNA. Mixtures 2 and 3 were prepared using four EV-B viruses. For mixture 2, PCR was performed using the primers designed to amplify the first half of the genome, whereas for mixture 3, PCR was performed for the full genome. ²The identifier of the contig provided by the NGS analysis. ³The length of the contig in nucleotides (∼7500 nt = enterovirus genome size). ⁴The name given to the virus in GenBank (using BLAST at NCBI). ⁵The percent identity between the contig sequence and the nearest sequence of a known virus in GenBank. ⁶Whether the VP1 region was retrieved in full or in part. ⁷The length of the VP1 region recovered.

To confirm the ability to detect a mixture of viruses from “real life” conditions, we used 20 supernatants from RD cells which were infected with 10 stool extracts and 10 sewage concentrates, respectively (Table 5). RNA extraction, RT-PCR, and NGS sequencing were performed as described in the related sections.

Table 5

Sample¹ | Enterovirus contig(s)² | VP1 contig(s)³ | Contig(s) with VP1 (nt)⁴ | Read count⁵ | Viral type(s)⁶ | Enterovirus species⁷
Stool 1516886450016E-21EV-B
Stool 2117475557642E-4EV-B
Stool 3327230452063E-11EV-B
2069842EV-C99EV-C
Stool 4226948371680E-2EV-B
693272007CV-A4EV-A
Stool 5317247437921E-14EV-B
Stool 6227426363551E-5EV-B
517031239EV-C99EV-C
Stool 7527411523337E-13EV-B
66055567E-20EV-B
Stool 8427109605697EV-B84EV-B
179175CV-A4EV-A
Stool 91127581413225E-15EV-B
1941359CV-A13pEV-C
Stool 10427109339135E-14EV-B
527441917EV-C99EV-C
SEW 1327337435585E-6EV-B
50951657E-13EV-B
SEW 2224444297319E-19EV-B
44048236E-7EV-B
SEW 3957403102527E-20EV-B
7380141133E-12EV-B
7371290910E-7EV-B
409411083E-6EV-B
3758390EV-A76EV-A
SEW 453457463109E-7EV-B
387113766CV-B5EV-B
7363390728E-33EV-B
SEW 5934476333848E-11EV-B
3302107812E-19EV-B
26074608E-12EV-B
SEW 683440081721E-6EV-B
4367359419CV-B5EV-B
20732E-6pEV-B
SEW 72025980409716E-6EV-B
26631321E-12EV-B
SEW 8326949303796E-12EV-B
685817635E-24EV-B
SEW 922441056755E-11EV-B
5174144696E-6EV-B
SEW 1028738384963E-12EV-B
615513411E-6EV-B
6176172800E-6EV-B
4228182570E-11EV-B
683100E-6pEV-B
1266227E-6pEV-B
33801060E-33EV-B

Detection of viral sequences in supernatants of RD cells infected with stool and sewage samples, using de novo assembly.

¹Type of isolates: RD cells infected with stool extracts or sewage (SEW) concentrates. ²Number of contigs identified per sample. ³Number of contigs that contained the VP1 region (needed to identify the virus). ⁴Size of the contig that included the VP1 region. Size varies, but the contigs of interest are those closest to the size of the whole genome (∼7500 nt). ⁵Number of reads used to generate the VP1 contigs. ⁶Virus type identified using the VP1 region. Note: p indicates a partial VP1, based on its length. In SEW 6 and SEW 10, different E-6 genotypes were found. ⁷Enterovirus species classification based on the virus identified.

PCR Purification and Next Generation Sequencing (NGS)

For each sample, the five PCR products were pooled, purified using a vacuum method, and sent to the sequencing platform PIBNET (Pasteur International Bioresources Network, Institut Pasteur, Paris). The libraries were prepared using 1 ng of DNA with the Nextera XT DNA Library Preparation Kit in a SureCycler 8800 thermocycler (Agilent). Following purification on AMPure beads (Beckman), the libraries were checked using the High Sensitivity D1000 assay (Agilent) on a TapeStation 2200. The products were sequenced using Illumina NextSeq HiSeq. All kits were used following the manufacturer’s instructions.

Data Analysis

Using CLC Genomics Workbench 8.5 (CLCbio), we paired and assembled contigs from the raw reads. Next, de novo assembly was performed using CLC Main Workbench (CLCbio) with the following parameters: Mismatch cost = 2; Insertion cost = 2; Deletion cost = 2; Length Fraction = 0.5; Similarity Fraction = 0.95. All contigs longer than 200 nt were submitted to National Center for Biotechnology Information (NCBI) for BLAST analysis.
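The length filter applied before BLAST analysis can be illustrated with a short sketch. The assembly itself was done in CLC software; the code below only shows, under the assumption that contigs have been exported in FASTA format, how contigs longer than 200 nt might be selected. The function names and the example records are hypothetical.

```python
def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def contigs_for_blast(lines, min_len=200):
    """Keep only contigs strictly longer than min_len nucleotides."""
    return [(name, seq) for name, seq in read_fasta(lines)
            if len(seq) > min_len]


if __name__ == "__main__":
    # Made-up example records, for illustration only.
    example = [
        ">contig_1", "A" * 150, "C" * 100,   # 250 nt, kept
        ">contig_2", "G" * 200,              # exactly 200 nt, dropped
        ">contig_3", "T" * 50,               # 50 nt, dropped
    ]
    for name, seq in contigs_for_blast(example):
        print(name, len(seq))
```

With a real export, the same function can be called on an open file handle, since a file object iterates over its lines.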

Results

Sequence Analysis and Primer Design

To sequence all EVs, we designed primers targeting conserved genomic regions that allowed the synthesis of overlapping amplicons. To amplify the first half of the genome, primer C004 (Bessaud et al., 2016) was used in combination with a newly designed generic primer (EV-CRE-R) targeting the CRE region. This primer set was used to amplify the first half of the genome for the four EV species (EV-A to -D). To ensure the best amplification of the second half of the genome, we designed primers specifically targeting species A, B, C, and D. The combination of the two primer sets (Table 2) allowed for the amplification of the 5′ and 3′ parts of the genomes, which led to the synthesis of two DNA fragments per virus that overlap by approximately 400 nt (see Figure 1). This method resulted in the amplification of the whole genome for samples containing a single virus or a mixture, and the amplicon products were used for NGS.
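The size of the overlap between the two fragments can be estimated from the primer positions in Table 2. The sketch below is only a rough calculation: the reverse primer position (∼4500) is approximate in the source, positions vary between strains, and the helper name is hypothetical.

```python
# Approximate position of the EV-CRE-R reverse primer (Table 2: ~4500).
CRE_REVERSE_POSITION = 4500

# 5' start positions of the species-specific forward primers (Table 2).
SECOND_HALF_FORWARD_START = {"A": 4110, "B": 4110, "C": 4220, "D": 4112}


def approximate_overlap(species):
    """Approximate overlap (nt) between fragment 1 and fragment 2."""
    return CRE_REVERSE_POSITION - SECOND_HALF_FORWARD_START[species]


if __name__ == "__main__":
    for species in sorted(SECOND_HALF_FORWARD_START):
        print(f"EV-{species}: ~{approximate_overlap(species)} nt overlap")
```

Under these assumptions the overlap is on the order of 300 to 400 nt for all four species, which gives de novo assembly a shared region for joining the two halves.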

To validate the capability of the proposed primer sets to generate whole genome sequencing data, we tested 15 EV-A, 40 EV-B, 20 EV-C, and 2 EV-D viruses (Table 1). We obtained full genome sequencing data for all viruses except two (EV-C95 T08-083 and EV-D70), for which we obtained 93.6 and 87.5% of the genomic sequences, respectively. In all cases, the sequences of the VP1 capsid protein could be used to confirm the type of virus. These results indicated the effectiveness of the proposed primers in conjunction with NGS sequencing to correctly identify the viruses and to reconstruct the whole genome using de novo assembly.

Additionally, to evaluate the sensitivity of this assay, we tested a fourfold serial dilution of one representative of each enterovirus species (Table 3). For undiluted viruses, the percentage of reads that mapped against the reference strains ranged from 86.9 to 99.3%, depending on the species. The full-length genome was reconstructed, using de novo assembly, for the majority of viruses and dilutions (ranging from 1.0 × 10^8 to 1.5 × 10^3 TCID50/ml). For some strains, long contigs covering almost the whole genomic sequence were recovered only when high amounts of virus were used (1.6 × 10^5 for CV-B4 and 1.6 × 10^6 for CV-A13).
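The percentage of mapped reads reported for each dilution is simply the number of reads mapping to the reference divided by the total number of reads. A minimal sketch follows; the helper name is hypothetical, and the read counts are those listed in Table 3 for the undiluted EV-A71 and EV-D68 samples.

```python
def percent_mapped(mapped_reads, total_reads):
    """Percentage of total reads that map to the reference, to one decimal."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return round(100.0 * mapped_reads / total_reads, 1)


if __name__ == "__main__":
    # Read counts from Table 3, undiluted samples (dilution factor 4^0).
    print("EV-A71:", percent_mapped(434360, 437922))
    print("EV-D68:", percent_mapped(465107, 468514))
```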

Mixture Detection

To assess the capacity of the method to detect and identify different enteroviruses in samples containing mixtures, we performed two types of experiments: (1) mixed samples containing different mixtures of known viral isolates under controlled conditions (spiked) and (2) supernatants of cells infected with stool or sewage extracts in which the mixture of viruses was unknown and the conditions were not controlled (unspiked).

Spiked Experiments

To simulate true clinical and/or environmental conditions, four viral isolates, one representing each enterovirus species, were used to make mixture 1 (similar quantities of viral RNA), and four viral isolates of species B were used for mixtures 2 and 3. For all mixtures, RNA was extracted and DNA amplification was performed using the newly designed primer sets.

For mixture 1, we successfully generated one contig per virus, each with 99.8–99.9% identity to the corresponding sequence in GenBank according to BLAST (Table 4). We were able to recover data for all four viral isolates, including the VP1 region used to identify the virus(es) present, either in full or in part.

For mixtures 2 and 3, we were able to correctly distinguish and identify all four serotypes from species B with 99.7–99.9% identity, and we recovered the full length of the VP1 region for the four species B isolates under both conditions (half and full genome amplification). Additionally, for all three experiments, we confirmed that the parameters used for de novo assembly could assemble the reads into separate contigs corresponding to the expected viruses.

Unspiked Experiments

To validate this method for field studies, we tested unknown mixtures of viral isolates present in the supernatants of RD cells infected with extracts from stool and sewage samples. Ten samples from stools and ten samples from sewage were used, following the protocols described in the Materials and Methods section. Full genomic sequences were identified for 8 viruses present in RD supernatants from stool samples and 5 viruses present in those from sewage samples. For certain viruses, however, only contigs covering partial genomic sequences could be obtained. In these cases, we used the VP1 contigs to identify the viruses present. We recovered 1 to 2 viruses per stool sample and 2 to 7 viruses per sewage sample (Table 5). Across the twenty samples, we found 48 enteroviruses based on the VP1 region. For 21 viruses, more than 6100 nt per genome could be determined, whereas for the 27 other viruses, genomic contigs were shorter. For 18 viruses, more than 90% of the genome length was recovered. Overall, 41 EV-Bs, 3 EV-As, 4 EV-Cs, and no EV-Ds were identified.

Discussion

The goal of this research was to sequence the entire genome of all enteroviruses (EV-A, -B, -C, and -D) present in a given sample, using a simple method for detection and genetic characterization. To accomplish this, we developed an RT-PCR assay with degenerate primers targeting conserved regions of the genome, which allowed the DNA fragments to be amplified and then sequenced by NGS. This paper describes our approach and demonstrates the feasibility of our method to successfully identify EVs in mixtures.

The effectiveness of this assay was achieved by selecting two primer sets per enterovirus species that generated two overlapping fragments decreasing labor, time, and cost per sample tested. The first generic primer set (C004-F and CRE-R) corresponds to two conserved regions for all enteroviruses (-A through -D) and was capable of successfully amplifying the first half of the genome (∼4500 nt). The second half of the genome was amplified using generic primers that were designed to specifically target all viruses within the respective species (EV-A, -B, -C, and -D). We were able to amplify nearly the entire genome for all enteroviruses, using only five PCR mixtures. The sensitivity of the primers was determined and the EV-A primers appeared to be more sensitive than the others (Table 3). However, the primers selected for this project were sensitive enough to detect all enteroviruses and specific enough to detect only one species type in the presence of the other species.

To validate our sequencing method, we applied it to twenty RD isolate samples derived from human stools and sewage concentrates. The consensus lengths obtained after de novo assembly for the ten stool isolate samples tested indicated that the method was sensitive enough to determine the entire genome. The results were different for sewage because the mixtures of strains appeared to be more complex (up to 7 different strains) than those present in stools (two strains maximum). Only three of the ten isolate samples resulted in sequencing data for the whole genome. However, the sequencing data for the other seven provided long genomic contigs and the coverage necessary to identify the type of virus(es) present using VP1 sequences (Nix et al., 2006). Additionally, we effectively recovered the four viruses used to produce the known mixtures described in Table 4 and the enterovirus field mixtures from Table 5. Being able to differentiate viruses in mixtures is important. In this study and in a previous one (Bessaud et al., 2016), we found that de novo assembly is able to properly construct contigs when mixed viruses are genetically divergent (i.e., when they do not share identical genetic sequences). However, recombination can create genomes that are clearly divergent from each other in some genomic regions and completely identical in others. In this case, de novo assembly can fail to build full-length contigs because it is impossible for the software to determine which reads are from virus 1 and which are from virus 2. This limitation is not related to the method we used to generate DNA from RNA genomes; rather, it arises because most NGS methods (e.g., Illumina) cannot deal with long DNA fragments and require the DNA amplicons to be sheared prior to sequencing. Other NGS methods (such as PacBio or MinION) can sequence long DNA fragments and could be used to overcome the assembly problems, but these methods have a high error rate compared to Illumina. Therefore, there is a trade-off between the accuracy of the sequences and the length of the contigs that can be generated in cases of mixed recombinant genomes. Nonetheless, this challenge did not compromise the main objective of our method, which was to allow the identification of the enterovirus lineages found in a given sample. The typing of EVs relies on their capsid sequence, and because intertypic recombination is very uncommon within this genomic region, the contigs generated through de novo assembly generally span the entire capsid-encoding region.

Research has shown the importance of analyzing the complete genome of enteroviruses (Oberste et al., 2004; Joffret et al., 2012) because nucleotide differences and inter- and intra-typic recombination events in non-structural regions differentiate types and lineages (Lukashev et al., 2003; Simmonds, 2006; Combelas et al., 2011). The amplification and sequencing of whole genomic sequences of strains belonging to EV-A, -B, and -C species using generic and specific primers have been successfully performed (Tan et al., 2015; Bessaud et al., 2016; Montmayeur et al., 2017; Sahoo et al., 2017). One particular study aimed to isolate polioviruses using random amplification and NGS to better optimize current protocols for whole genome sequencing and identification of a variety of vaccine-derived polioviruses (Montmayeur et al., 2017). Although these methods were able to recover the entire genome for a given type or species, they were not able to sequence mixtures of several human enteroviruses in a single run, as the method described here does. However, a recent study focused on detecting polioviruses and non-polio enteroviruses in cellular supernatants infected with sewage concentrates using NGS and random primers or specific primers targeting polioviruses (Majumdar et al., 2018).

In contrast to these previous studies, our assay was designed to specifically amplify enteroviruses of species A to D, capturing the diversity of EVs. Because the assay includes species-specific primers, we can increase the amplification of the respective targeted species. This is not the case with random primers, which cannot be guaranteed to capture all enteroviruses within a given species. In addition, the assay can be simplified to target one particular species. Although strategies based on random amplification have the advantage of being able to amplify sequences of viruses that are unexpected or unknown, they have the disadvantage of decreasing the number of relevant reads, since sequences of non-viral origin are also amplified and sequenced. Our strategy is more suitable for specifically amplifying the relevant sequences, which limits the required depth of coverage, favors multiplexing, and thus reduces the cost per genome and per sample.

In conclusion, the method described in this study enables the specific identification of all enteroviruses present in samples during a single sequencing run. This type of assay would be useful when analyzing human and environmental field samples as indicated in our results.

This contributes to the study of EV diversity and of the EV ecosystem within given populations. In addition, this method provided a way to collect full genome sequencing data from mixtures of enteroviruses of the four species A to D that were present in cell culture supernatants. Such data are necessary for surveillance purposes and for studying the relationships between the genetic characteristics of enteroviruses, including the mosaic features of their genomes acquired through frequent intra- and intertypic recombination, and their biological properties. Indeed, mutations and recombination events can be implicated in reintroducing virulence factors and have been involved in the evolution of enteroviruses (Riquet et al., 2008; Jegouic et al., 2009; Joffret et al., 2012; Holmblat et al., 2014).

Statements

Author contributions

M-LJ, PP, and FD conceived and designed the experiments and analyzed the data. M-LJ and PP performed the experiments. RR and J-MH contributed to samples and experiments in Madagascar. PP and M-LJ wrote the manuscript. MB, RR, J-MH, and FD critically revised the manuscript.

Funding

The collection and shipment of samples were supported by the Centers for Disease Control and Prevention through the “Intensive Virologic Monitoring of the tOPV/bOPV switch” project. This work was supported by the Institut Pasteur (PTR 484) and by the Foundation Total grant S-CM15010-05B (http://fondation.total/fr). Patsy Polston is a postdoctoral fellow sponsored in part by a grant from the Pasteur Foundation and the Dennis and Mireille Gillings Foundation.

Acknowledgments

The authors are indebted to Maud Vanpeene, Andrea Alexandru, Sobhy Wilhame, and Vincent Enouf [Institut Pasteur, Pasteur International Bioresources network (PIBNET), Plateforme de microbiologie mutualisée (P2M), Paris, France] for performing the sequencing experiments. We thank Deborah Delaune and Anne-Lou Pinon for their assistance with some of the experiments. We also thank the staff of the Institut Pasteur de Madagascar involved in sample collection and virus isolation (Seta Andriamamonjy, Fitahiana Michael Rakotoarison, and Sandratana Raharinantoanina). In addition, we are grateful to Anda Baicus (Romania) and Serge Sadeuh-Mba (Cameroon) for providing isolates used in this study.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  • 1

    Bessaud, M., Sadeuh-Mba, S. A., Joffret, M.-L., Razafindratsimandresy, R., Polston, P., Volle, R., et al. (2016). Whole genome sequencing of Enterovirus species C isolates by high-throughput sequencing: development of generic primers. Front. Microbiol. 7:1294. doi: 10.3389/fmicb.2016.01294

  • 2

    Blomqvist, S., and Roivainen, M. (2016). Isolation and characterization of enteroviruses from clinical samples. Methods Mol. Biol. 1387, 19–28. doi: 10.1007/978-1-4939-3292-4_3

  • 3

    Caro, V., Guillot, S., Delpeyroux, F., and Crainic, R. (2001). Molecular strategy for ‘serotyping’ of human enteroviruses. J. Gen. Virol. 82, 79–91. doi: 10.1099/0022-1317-82-1-79

  • 4

    Combelas, N., Holmblat, B., Joffret, M. L., Colbere-Garapin, F., and Delpeyroux, F. (2011). Recombination between poliovirus and coxsackie A viruses of species C: a model of viral genetic plasticity and emergence. Viruses 3, 1460–1484. doi: 10.3390/v3081460

  • 5

    Holmblat, B., Jégouic, S., Muslin, C., Blondel, B., Joffret, M.-L., and Delpeyroux, F. (2014). Nonhomologous recombination between defective poliovirus and coxsackievirus genomes suggests a new model of genetic plasticity for picornaviruses. mBio 5:e1119-14. doi: 10.1128/mBio.01119-14

  • 6

    Jegouic, S., Joffret, M. L., Blanchard, C., Riquet, F. B., Perret, C., Pelletier, I., et al. (2009). Recombination between polioviruses and co-circulating Coxsackie A viruses: role in the emergence of pathogenic vaccine-derived polioviruses. PLoS Pathog. 5:e1000412. doi: 10.1371/journal.ppat.1000412

  • 7

    Joffret, M. L., Jegouic, S., Bessaud, M., Balanant, J., Tran, C., Caro, V., et al. (2012). Common and diverse features of cocirculating type 2 and 3 recombinant vaccine-derived polioviruses isolated from patients with poliomyelitis and healthy children. J. Infect. Dis. 205, 1363–1373. doi: 10.1093/infdis/jis204

  • 8

    Lukashev, A. N., Lashkevich, V. A., Ivanova, O. E., Koroleva, G. A., Hinkkanen, A. E., and Ilonen, J. (2003). Recombination in circulating enteroviruses. J. Virol. 77, 10423–10431. doi: 10.1128/jvi.77.19.10423-10431.2003

  • 9

    Majumdar, M., Klapsa, D., Wilton, T., Akello, J., Anscombe, C., Allen, D., et al. (2018). Isolation of vaccine-like poliovirus strains in sewage samples from the United Kingdom. J. Infect. Dis. 217, 1222–1230. doi: 10.1093/infdis/jix667

  • 10

    Montmayeur, A. M., Ng, T. F., Schmidt, A., Zhao, K., Magana, L., Iber, J., et al. (2017). High-throughput next-generation sequencing of polioviruses. J. Clin. Microbiol. 55, 606–615. doi: 10.1128/JCM.02121-16

  • 11

    Nijhuis, M., van Maarseveen, N., Schuurman, R., Verkuijlen, S., de Vos, M., Hendriksen, K., et al. (2002). Rapid and sensitive routine detection of all members of the genus enterovirus in different clinical specimens by real-time PCR. J. Clin. Microbiol. 40, 3666–3670. doi: 10.1128/JCM.40.10.3666-3670.2002

  • 12

    Nix, W. A., Oberste, M. S., and Pallansch, M. A. (2006). Sensitive, seminested PCR amplification of VP1 sequences for direct identification of all enterovirus serotypes from original clinical specimens. J. Clin. Microbiol. 44, 2698–2704. doi: 10.1128/JCM.00542-06

  • 13

    Oberste, M. S., Maher, K., Kilpatrick, D. R., Flemister, M. R., Brown, B. A., and Pallansch, M. A. (1999a). Typing of human enteroviruses by partial sequencing of VP1. J. Clin. Microbiol. 37, 1288–1293.

  • 14

    Oberste, M. S., Maher, K., Kilpatrick, D. R., and Pallansch, M. A. (1999b). Molecular evolution of the human enteroviruses: correlation of serotype with VP1 sequence and application to picornavirus classification. J. Virol. 73, 1941–1948.

  • 15

    Oberste, M. S., Maher, K., and Pallansch, M. A. (2004). Evidence for frequent recombination within species human enterovirus B based on complete genomic sequences of all thirty-seven serotypes. J. Virol. 78, 855–867. doi: 10.1128/JVI.78.2.855-867.2004

  • 16

    Poh, C. L., and Tan, E. L. (2010). “Detection of enteroviruses from clinical specimens,” in Diagnostic Virology Protocols, eds J. R. Stephenson and A. Warnes (Berlin: Springer), 65–77. doi: 10.1007/978-1-60761-817-1_5

  • 17

    Riquet, F. B., Blanchard, C., Jegouic, S., Balanant, J., Guillot, S., Vibet, M. A., et al. (2008). Impact of exogenous sequences on the characteristics of an epidemic type 2 recombinant vaccine-derived poliovirus. J. Virol. 82, 8927–8932. doi: 10.1128/JVI.00239-08

  • 18

    Sadeuh-Mba, S. A., Bessaud, M., Massenet, D., Joffret, M. L., Endegue, M. C., Njouom, R., et al. (2013). High frequency and diversity of species C enteroviruses in Cameroon and neighboring countries. J. Clin. Microbiol. 51, 759–770. doi: 10.1128/JCM.02119-12

  • 19

    Sahoo, M. K., Holubar, M., Huang, C., Mohamed-Hadley, A., Liu, Y., Waggoner, J. J., et al. (2017). Detection of emerging vaccine-related polioviruses by deep sequencing. J. Clin. Microbiol. 55, 2162–2171. doi: 10.1128/JCM.00144-17

  • 20

    Silva, P. A., Diedrich, S., de Paula Cardoso, D. D., and Schreier, E. (2008). Identification of enterovirus serotypes by pyrosequencing using multiple sequencing primers. J. Virol. Methods 148, 260–264. doi: 10.1016/j.jviromet.2007.10.008

  • 21

    Simmonds, P. (2006). Recombination and selection in the evolution of picornaviruses and other mammalian positive-stranded RNA viruses. J. Virol. 80, 11124–11140. doi: 10.1128/JVI.01076-06

  • 22

    Tan, L. V., Tuyen, N. T. K., Thanh, T. T., Ngan, T. T., Van, H. M. T., Sabanathan, S., et al. (2015). A generic assay for whole-genome amplification and deep sequencing of enterovirus A71. J. Virol. Methods 215, 30–36. doi: 10.1016/j.jviromet.2015.02.011

Summary

Keywords

human enteroviruses, whole-genome sequences, high-throughput sequencing, viral mixtures, enterovirus identification

Citation

Joffret M-L, Polston PM, Razafindratsimandresy R, Bessaud M, Heraud J-M and Delpeyroux F (2018) Whole Genome Sequencing of Enteroviruses Species A to D by High-Throughput Sequencing: Application for Viral Mixtures. Front. Microbiol. 9:2339. doi: 10.3389/fmicb.2018.02339

Received

02 July 2018

Accepted

12 September 2018

Published

28 September 2018

Volume

9 - 2018

Edited by

Hirokazu Kimura, Gunma Paz University, Japan

Reviewed by

Flore Rozenberg, Université Paris Descartes, France; Komei Shirabe, Yamaguchi Prefectural Institute of Public Health and Environment, Japan

Copyright

*Correspondence: Marie-Line Joffret, Patsy M. Polston,

These authors have contributed equally to this work

This article was submitted to Virology, a section of the journal Frontiers in Microbiology

Disclaimer

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
