Whole Genome Analysis of Selected Human Group A Rotavirus Strains Revealed Evolution of DS-1-Like Single- and Double-Gene Reassortant Rotavirus Strains in Pakistan During 2015–2016

Acute gastroenteritis due to group A rotaviruses (RVAs) is the leading cause of infant and childhood morbidity and mortality particularly in developing countries including Pakistan. In this study we have characterized the whole genomes of five RVA strains (PAK56, PAK419, PAK585, PAK622, and PAK663) using the Illumina HiSeq platform. The strains PAK56 and PAK622 exhibited a typical Wa-like genotype constellation (G9-P[8]-I1-R1-C1-M1-A1-N1-T1-E1-H1 and G3-P[8]-I1-R1-C1-M1-A1-N1-T1-E1-H1, respectively), whereas PAK419, PAK585, and PAK663 exhibited distinct DS-1-like genotype constellations (G3P[4]-I2-R2-C2-M2-A2-N2-T1-E2-H2, G1P[8]-I2-R2-C2-M2-A2-N2-T2-E2-H2, and G3P[4]-I2-R2-C2-M2-A2-N2-T2-E2-H2, respectively). Despite their DS-1-like genotype constellation, strain PAK585 possessed the typical Wa-like G1P[8] genotypes, whereas both PAK419 and PAK663 possessed the G3 genotype. In addition, PAK419 also possessed the Wa-like NSP3 genotype T1, suggesting that multiple reassortments have occurred. On Phylogenetic analysis, all of the gene segments of the five strains examined in this study were genetically related to globally circulating human G1, G2, G3, G6, G8, G9, and G12 strains. Interestingly, the NSP2 gene of strain PAK419 showed closest relationship with Indian bovine strain (India/HR/B91), suggesting the occurrence of reassortment between human and bovine RVA strains. Furthermore, strains PAK419, PAK585, and PAK663 were closely related to one another in most of their gene segments, indicating that these strains might have been derived from a common ancestor. To our knowledge this is the first whole genome-based molecular characterization of human rotavirus strains in Pakistan. The results of our study will enhance our existing knowledge on the diversity and evolutionary dynamics of novel RVA strains including DS-1-like intergenogroup reassortant strains spreading in Asian countries including Pakistan, in the pre-vaccine era. Therefore, continuous surveillance is recommended to monitor the evolution, spread and genetic stability of novel reassortant rotavirus strains derived from such events.


INTRODUCTION
Group A rotaviruses (RVAs) belong to the family Reoviridae and are responsible for severe dehydrating diarrhea in children less than 5 years of age and in many animal species worldwide . In spite of the universal implementation of two rotavirus vaccines (Rotarix and RotaTeq), in 2016, rotavirus was estimated to cause 128,500 deaths of children less than 5 years of age, of which 13,396 deaths occurred in South Asia (Troeger et al., 2018).
Most of these novel DS-1-like strains were not completely sequenced. Whole genome analysis is a reliable method for obtaining conclusive data on the origin of RVA strain, and for tracing its evolutionary pattern. During the last decade, research on RVA in Pakistan mainly focused on the detection of current RVA genotypes based on VP7 and VP4. To the best of our knowledge, this is the first time genomes of Pakistani RVA strains have been fully sequenced and characterized. In this study, the Illumina HiSeq platform was used to obtain the complete nucleotide sequences of the whole genomes of five Pakistani RVA strains.

Ethics Statement
The study was approved by the Ethical Committee of PIMS, Benazir Bhutto Shaheed Hospital (BBH) and Internal Review Board (IRB) of COMSATS Institute of Information Technology, Islamabad. Written informed consent for molecular characterization of RVAs strains were obtained from the parents/guardians of the study participants.

RVA Strains
RVA strains, PAK56, PAK419, PAK585, PAK622, and PAK663, showed maximum diversity upon phylogenetic analysis were selected for whole genome sequencing. These strains were detected in five fecal samples from hospitalized children aged less than 10 years with severe gastroenteritis admitted to the emergency pediatrics department of BBH (Benazir Bhutto Hospital) in Rawalpindi, and PIMS (Pakistan Institute of Medical Sciences) in Islamabad. These stool samples were collected during RVA surveillance program in Pakistan in 2015-2016 which involved a total of 639 samples (Sadiq et al., 2019). Of the total 639 samples RVA was found positive in 171 (26.8%) of the samples. Stool samples containing above selected strains were kept at −80 • C until use.

Sample Preparation for NGS
Each sample was enriched for viral particles by the NetoVIR (Novel Enrichment Technique Of VIRomes) protocol (Conceição-neto et al., 2015). Briefly, fecal suspensions (10%) were prepared by adding 50 mg of fecal sample in 500 µl of sterile PBS in a clean 1.5 ml centrifuge tube. Fecal suspensions were subjected to homogenization for 1 min at 3000 rpm with a MINILYS homogenizer (Bertin Technologies, Montigny-le-Bretonneux, France) and then centrifuged at 17000 × g for 3 min. The supernatant was filtered through a 0.8 µm (PES) filter (Merck Millipore, Burlington, MA, United States) at 17000 × g for 1 min. After filtration samples were treated for 2 h at 37 • C with a cocktail of 7 µl of 20× homemade buffer (1 M Tris, 100 mM CaCl2 and 30 mM MgCl2, pH = 8), 2 µL of benzonase (Novagen, Madison, WI, United States) and 1 µL of micrococcal nuclease (New England Biolabs, MA, United States). Reaction was stopped by adding 7 µl of 10 nM EDTA. Nucleic acid extraction was conducted by using the QIAamp R Viral RNA Mini kit from Qiagen (Qiagen, Hilden, Germany) according to manufacturer instructions without adding carrier RNA.
cDNA Library Preparation and Illumina Sequencing cDNA synthesis was carried out and random PCR amplification for 17 cycles were performed using a Whole Transcriptome Amplification (WTA) Kit (Sigma-Aldrich) according to the manufacturer's instructions. The PCR products were purified with the MSB Spin PCRapace spin columns (Stratec, Berlin, Germany). The libraries for Illumina sequencing were prepared using the Nextera XT Library Preparation Kit (Illumina, San Diego, CA, United States) according to the manufacturer's guide. Library purification was performed using a 1.8 ratio of Agencourt AMPure XP beads (Beckman Coulter, Inc., Nyon, Switzerland). The quality check of the purified cDNA library was performed on 2100 Bioanalyzer (Agilent Technologies). The libraries were quantified with the KAPA Library Quantification kit (Kapa Biosystems) followed by sequencing on a HiSeq TM 2500 platform (Illumina) for 300 cycles (2 × 150 bp paired ends).

Genomic Assembly
The obtained raw reads were trimmed for quality and adapters, and assembled into contigs de novo using Trimmomatic and SPAdes, respectively (Bankevich et al., 2012;Bolger et al., 2014). Scaffolds were then classified using DIAMOND in sensitive mode (Buchfink et al., 2015). Contigs assigned to RVA were used to map the trimmed reads using the Burrows-Wheeler Alignment tool (BWA) (Li and Durbin, 2009). Open reading frames (ORF) were identified with ORF Finder analysis tool 1 .

Genotyping and Phylogenetic Analysis
Genotypes for all eleven gene segments of RVA were determined by the RotaC v2.0 automated genotyping tool for RVA according to the guidelines proposed by the Rotavirus Classification Working Group (RCWG) (Maes et al., 2009). Multiple nucleotide-based sequence alignments were conducted using ClustalW in MEGA version 6.06 (Tamura et al., 2013). Reference sequences were retrieved from GenBank and phylogenetic trees were constructed by Maximum Likelihood method with kimura-2-parameter model in MEGA 6.06 (Tamura et al., 2013). The statistical reliability was checked using 1000 bootstrap replicates. Nucleotide and amino acid distances were calculated using the P Distance Model. Phylogenetic analysis were performed using appropriate reference strains in addition to the RVA discovered in this study.

Nucleotide Sequence Accession Number
The nucleotide sequences were submitted to the GenBank under the following accession numbers: (

Genotyping Based on Whole-Genome Sequencing
In order to gain insight into the genetic variability among strains PAK56, PAK419, PAK585, PAK622, and PAK663 and their genetic relatedness with other RVA strains worldwide, the fullgenome sequences of these five strains were determined using the NetoVIR protocol in combination with NGS Illumina HiSeq sequencing. Full or nearly full length nucleotide sequences reads were obtained for all the 11 gene segments of the five human rotavirus strains via Illumina HiSeq platform. The contigs length and number of reads mapped against the de novo assembled RVA genome contigs are given in the Table 1. Comparison of the complete genotype constellations of these strains with those of representative G1, G3, G9 and other RVA strains is shown in Figure 1. The eleven gene segments of two strains (PAK56 and PAK622) were found to have a typical Wa-like genotype constellation: G9-P . Three strains (PAK419, PAK585, and PAK663) comprised genotypes typical of both the Walike and DS-1-like genotype:

Phylogenetic Analysis
Phylogenetic trees were constructed for each of the 11 segments of the five RVA strains obtained in this study, together with representative strains from GenBank.
The VP7 gene of strain PAK56 clustered closely with G9-lineage three strains from all over the world (Belgium, Bangladesh, Italy, Russia and Korea), and exhibited maximum nucleotide (nt) identities (98.9-99.5%) with G9 strains in Korea (CAU08-350) and Russia (H508). The VP7 gene of strain PAK585 clustered closely with G1-lineage 1 strains from all over the world (India, Australia, United States, Russia, Philippines and previously reported strains from Pakistan), and showed maximum nucleotide (nt) identities (99.3-99.4%) with G1 strain from Australia and India (CK00084, RV0911, respectively). On the other hand, the VP7 genes of strains PAK419, PAK622, and PAK663 clustered together with G3 strains belong to lineage three reported worldwide (United States, Russia, Paraguay, and China), and exhibited highest nucleotide identities (99.2-99.3%) with each other and with G3 strains from Russia (Nov11-N2317) and United States (SSCRTV_00017) (Figure 2).
The VP4 genes of strains PAK419 and PAK663 clustered closely with other strains identified in P[4] lineage 5 from all over the world (United States, Australia, and Bangladesh), and showed maximum nucleotide identities (98-99.2%) with each other and with DS-1-like strains in United States (LB1562) and Bangladesh (J331) In contrast, VP4 gene of strains PAK56, PAK622, and PAK585 clustered closely with other strains isolated in OP354like P[8] lineage 4 isolated from other countries of the world, and showed maximum nucleotide identities (98-98.7%) with each other and with P[8] rotavirus strain from Russia (Nov10-N53) and Israel (R5751) (Figure 3).
The NSP2 and NSP3 genes of strains PAK56 and PAK622 clustered closely together with other strains reported in NSP2 and NSP3 genotypes N1 and T1, respectively from all over the world (Belgium, Korea, and Italy). The above gene segments of these strains exhibited maximum nucleotide identities (99.4-99.9%) to each other and with Wa-like G1P[8] strains from Belgium (00045) and Korea (C1-81) and comparable identities (97.7-98%) to a cow rotavirus strain from Uganda found in a common branch. The NSP2 gene of strain PAK419 clustered closely together with other globally circulating human and animal rotavirus strains identified in NSP2 genotype N2, and showed maximum nucleotide (nt) identity (98.3%) with Indian bovine strain (HR/B91). While, NSP3 gene segment of PAK419 clustered closely together with strains reported worldwide in NSP3 genotype T1, and showed 98.9% nucleotide identity with Chinese Wa-like strain (E2484/G4P[8]). In contrast, the NSP2 and NSP3 genes of strains PAK585 and PAK663 clustered closely together with other strains identified in previously established genotypes of NSP2 and NSP3 (N2 and T2, respectively) from all over the world (Philippines, United States, India, Bangladesh, and Malawi), and showed nucleotide identities of (96.4-97.3%) and (98-99.5%) with each other and with that of DS-1 like strains from the Philippines (TGO12-007), United States (LB1562), India (KOL-006), respectively (Figures 4, 5).
The VP2 genes of strains PAK419, PAK585, and PAK663 clustered closely together with other strains identified in VP2 lineage C2 from all over the world (Belgium, Australia and previously reported strain from Pakistan), and exhibited maximum nucleotide similarities (99.1-99.2%) with cocirculating DS-1-like strain from Pakistan (PAK93) and comparable identities (98.5-98.7%) with Australian DS-1-like strain (CK20024/G2P[4]). On the other hand, the VP2 gene of strains PAK56 and PAK622 clustered with other strains identified in VP2 lineage C1 from all over the world (Belgium, Bangladesh, India, Africa, United States, Australia, Korea, Italy, Egypt, Vietnam, and Pakistan) and showed maximum nucleotide identities (99.1-99.3%) with Wa-like strains from Australia (Victoria/CK00037) and Belgium (BE00031) and previously reported strain from Pakistan (HF66) (Figure 12).

DISCUSSION
In the present study, the whole genomes of five human RVA strains (PAK56, PAK419, PAK585, PAK622, and PAK663) were characterized for the first time in children with severe gastroenteritis in Pakistan in 2015-2016. The two strains (PAK56 and PAK622) exhibited a typical Wa-like backbone while, three strains (PAK419, PAK585, and PAK663) showed genotype constellation involving genogroup 1 and genogroup 2 genes. The unique genotype constellation of (PAK419, PAK585, and PAK663) were found to be similar to DS-1-like intergenogroup reassortant strains. Pakistan has implemented RVA vaccine in national immunization FIGURE 2 | Phylogenetic tree constructed from the nucleotide sequences of the VP7 genes of the study strains and representative RVA strains. RVA strains sequenced in this study are represented by the red color. The vaccine strains and an out group strain are represented by Blue and green color, respectively while black shading represent strains isolated all over the world. The RVA strains sequenced in this study and reference strains obtained from GenBank database are represented by Accession number, Strain Name, Country and year of Isolation. Scale bar: 0.05 substitutions per nucleotide. Bootstrap values <60 are not shown.  program in January 2018, so the emergence of intergenogroup reassorted strains in the present study are not linked to RVA vaccine introduction.
The DS-1-like G1P[8] double-gene reassortant strains were first detected in Japan followed by cases reported in Thailand, Vietnam, Malawi, and Brazil (Komoto et al., 2010Nakagomi et al., 2012;Fujii et al., 2014;Kuzuya et al., 2014;Yamamoto et al., 2014;Jere et al., 2018;Luchs et al., 2019). The unusual DS-1-like G1P[8] double-gene reassortant strain PAK585 detected in the present study showed close similarity with other DS-1-like, DS-1-like G1P[8] and Wa-like strains circulating globally (India, Australia, United States, Thailand, Bangladesh, Malawi, and Philippines), as observed in the phylogenetic tree. The analysis of PAK585 revealed that eight out of eleven gene segments (VP1-VP3, VP6, NSP1-NSP3, and NSP5) of this strain showed close similarities with DS-1-like intergenogroup reassorted strains or typical DS-1 like strains circulating worldwide. One segment (NSP4) showed close nucleotide similarities (97.6-98.2%) with human strains from Australia, Thailand, Philippines, and previously reported strains from Pakistan. While the remaining two segments (VP7 and VP4) were assumed to be derived from typical human Wa-like strains circulating worldwide. Therefore, the emergence of this unusual Pakistani G1P[8] strain more likely was due to their spread from other Asian countries into Pakistan, rather than the local emergence through multiple reassortment event(s) between locally circulating Ds-1-like and Wa-like G1P[8] strains.
The G3 genotype is documented as the third most predominant RVA genotype in humans mostly found in combination with P[8] (Atima et al., 2016). The G3 rotavirus strains has wide host range comprising cows, dogs, cats, horses rabbits, pigs, bats, sheeps, and monkeys in association with many P genotypes (Matthijnssens et al., 2011a). In this study we have detected two strains possessing the G3P[4] genotype, a single gene reassortant strain (PAK663/G3P[4]) and double gene reassortant starin (PAK419/G3P[4]). These two strains showed close relationship with G2P[4], G3P[8], G9P[4], and G12P[6] strains identified worldwide (United States, Thailand, India, Bangladesh, Malawi, Philippines, and Australia) and previously reported G1P[6] strain in Pakistan. These unusual G3 strains shared nine gene segments (VP1-VP4, VP6, NSP1-NSP2, and NSP4-NSP5) with typical DS-1-like or DS-1-like intergenogroup reassortant strains, whereas, one segment (VP7) was assumed to be of typical human Wa-like in the phylogenetic tree. Notably, NSP3 gene of strain PAK419 was observed with an additional reassortment event in which T2 genotype of strain PAK419 was replaced with human Wa-like genotype T1. So, it is suggested that strain PAK419 and PAK663 have been derived in Pakistan due to multiple reassortments events between locally and globally circulating human Ds-1-like, Wa-Like, human like bovine or bovine strains. Moreover, on phylogenetic observation, PAK663 and PAK 419 strains were found to be close to each other in their maximum gene segment, suggesting they have evolved from a common ancestor.    FIGURE 9 | Phylogenetic tree constructed from the nucleotide sequences of the NSP1 genes of the study strains and representative RVA strains. RVA strains sequenced in this study are represented by the red color. The vaccine strains and an out group strain are represented by Blue and green color, respectively while black shading represent strains isolated all over the world. The RVA strains sequenced in this study and reference strains obtained from GenBank database are represented by Accession number, Strain Name, Country and year of Isolation. Scale bar: 0.1 substitutions per nucleotide. Bootstrap values <60 are not shown.
FIGURE 10 | Phylogenetic tree constructed from the nucleotide sequences of the NSP4 genes of the study strains and representative RVA strains. RVA strains sequenced in this study are represented by the red color. The vaccine strains and an out group strain are represented by Blue and green color, respectively while black shading represent strains isolated all over the world. The RVA strains sequenced in this study and reference strains obtained from GenBank database are represented by Accession number, Strain Name, Country and year of Isolation. Scale bar: 0.5 substitutions per nucleotide. Bootstrap values <60 are not shown.
FIGURE 11 | Phylogenetic tree constructed from the nucleotide sequences of the NSP5 genes of the study strains and representative RVA strains. RVA strains sequenced in this study are represented by the red color. The vaccine strains and an out group strain are represented by Blue and green color, respectively while black shading represent strains isolated all over the world. The RVA strains sequenced in this study and reference strains obtained from GenBank database are represented by Accession number, Strain Name, Country and year of Isolation. Scale bar: 0.05 substitutions per nucleotide. Bootstrap values <60 are not shown.
FIGURE 12 | Phylogenetic tree constructed from the nucleotide sequences of the VP2 genes of the study strains and representative RVA strains. RVA strains sequenced in this study are represented by the red color. The vaccine strains and an out group strain are represented by Blue and green color, respectively while black shading represent strains isolated all over the world. The RVA strains sequenced in this study and reference strains obtained from GenBank database are represented by Accession number, Strain Name, Country and year of Isolation. Scale bar: 0.05 substitutions per nucleotide. Bootstrap values <60 are not shown.
On the other hand, two Pakistani strains PAK56 and PAK622 having genotypes G9P[8] and G3P[8] detected in the present study were found to be very similar to a typical human Wa-like strains reported in several countries worldwide. The analysis of ten of the eleven gene segments (VP1-VP4, VP6-VP7, NSP1-NSP4) of strains PAK56 and PAK 622 showed a typical human Wa-like origin. While, gene segment NSP5 showed close relationship with human rotavirus strains detected in Korea and Africa and with sheep and Alpaca rotavirus strains from Peru in the same cluster. These observation suggests that interspecies transmission and reassortments might have occurred between human/caprine rotavirus strains. Also on the phylogenetic tree, these two strains were clustering closely in their maximum gene segments suggestive of their origin from a common ancestor.

CONCLUSION
This is the first full genome characterization of human rotavirus strains circulating in Pakistan. The evolution of DS-1-like intergenogroup reassortant strains having G1P[8] and G3P[4] genotypes in Pakistan suggested the constant circulation of these unusual rotavirus strains and ongoing reassortment associated with them particularly in Asia. After WHO recommendation in 2009, the universal implementation of RVA vaccine is continue to increase globally. Whether or not the coinciding rise of strains with Wa-like VP4 and VP7 genes together with a DS-1-like backbone and mass vaccine introduction is consequential or purely random remains enigmatic. Therefore, further investigations via continuous surveillance of these strains is recommended to monitor the evolution, spread and genetic stability of novel reassortant rotavirus strains derived from such events. Pakistan has licensed RVA vaccine in national EPI program in January 2018. After the introduction of RVA vaccine in the country's immunization program the careful monitoring is needed to estimate vaccine effectiveness and challenges to current vaccine strategies against such unusual strains.

DATA AVAILABILITY STATEMENT
The datasets generated for this study can be found in the nucleotide sequences were submitted to the GenBank under the following accession numbers: (VP7)

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Ethical Committee of PIMS, Benazir Bhutto Shaheed Hospital (BBH) and Internal Review Board (IRB) of COMSATS Institute of Information Technology. Written informed consent to participate in this study was provided by the participants' legal guardian/next of kin. Written informed consent was obtained from the minor(s)' legal guardian/next of kin for the publication of any potentially identifiable images or data included in this article.

AUTHOR CONTRIBUTIONS
AS and NB conceived concept and designed the experiments. AS performed the experiments and wrote the manuscript. NB, HB, KY, and JM contributed the reagents and data analysis tools and helped in write-ups.