Metatranscriptomic assessment of diarrhoeic faeces reveals diverse RNA viruses in rotavirus group A infected piglets and calves from India

RNA viruses are a major group contributing to emerging infectious diseases and neonatal diarrhoea, causing morbidity and mortality in humans and animals. Hence, the present study investigated the metatranscriptome-derived faecal RNA virome in rotavirus group A (RVA)-infected diarrhoeic piglets and calves from India. The viral genomes retrieved belonged to Astroviridae in both species, while Reoviridae and Picornaviridae were found only in piglets. The nearly complete genomes of porcine RVA (2), astrovirus (AstV) (6), enterovirus G (EVG) (2), porcine sapelovirus (PSV) (2), Aichivirus C (1), and porcine teschovirus (PTV) (1) were identified and characterised. In the piglets, AstVs of the PAstV2 (MAstV-26) and PAstV4 (MAstV-31) lineages were predominant, followed by porcine RVA, EVG, PSV, Aichivirus C, and teschovirus (PTV-17) in decreasing order of sequence reads. In contrast, AstV accounted for the majority of reads in bovines and belonged to MAstV-28 and a proposed MAstV-35. Both RVA G4P[6] strains exhibited a prototype Gottfried strain-like genotypic constellation of G4-P[6]-I1-R1-C1-M1-A8-N1-T1-E1-H1. Ten out of eleven genes were of porcine origin, while the VP7 gene clustered with G4-lineage-1, consisting of human strains, suggesting a natural porcine-human reassortant. In the recombination analysis, multiple recombination events were detected in the PAstV4 and PAstV2 genomes, indicating that these viruses are potential recombinants. Overall, the study reports, for the first time, a diverse RNA virome in Indian piglets and calves, which may have contributed to diarrhoea. In the future, the investigation of the RNA virome in animals will help in revealing pathogen diversity in multifactorial diseases and disease outbreaks, monitoring circulating viruses, discovering new viruses, and evaluating their zoonotic potential.


Introduction
RNA viruses, the most important group involved in zoonotic infections, pose a challenge for safeguarding public and animal health. Because of their broad host range, great genetic diversity, high mutation rate, and recombination capacity, RNA viruses often emerge as new viruses (Papp et al., 2013a; Shi et al., 2018; Werid et al., 2022). The major RNA viruses causing diarrhoea in pigs include rotavirus A-C and H, porcine epidemic diarrhoea virus, and transmissible gastroenteritis virus, while those in calves include rotavirus A-C, bovine viral diarrhoea virus, and coronavirus. Among them, rotavirus is a major enteric pathogen causing diarrhoea (Crawford et al., 2017). Other RNA viruses, such as caliciviruses (norovirus, sapovirus), astroviruses (AstVs), picornaviruses (kobuvirus, teschovirus, enteroviruses), and toroviruses, are detected as coinfections or superinfections with or without clinical symptoms (Su et al., 2020). Some of these porcine and bovine viruses are genetically similar to human viruses and are considered potentially zoonotic (Papp et al., 2013b; Werid et al., 2022).
'Porcine enteroviruses', members of the family Picornaviridae, comprise porcine teschovirus (PTV1-13), enterovirus (EVG1-20), sapelovirus (SV-A), and kobuvirus (Aichivirus C). PTV infections in endemic situations are asymptomatic; however, some virulent strains cause neurologic disease, mild (Talfan disease) or severe (Teschen disease), and a range of systemic infections, including diarrhoea, respiratory illness, and myocarditis. Porcine sapeloviruses (PSV) are causative agents of diarrhoea, respiratory and reproductive presentations, and, more recently, polioencephalomyelitis. The fourth virus, Aichivirus C (genus Kobuvirus), is detected in healthy and diarrhoeic piglets, and it is suggested that it can induce diarrhoea in the presence of viruses like rotavirus and transmissible gastroenteritis virus (Cao et al., 2012). A zoonotic potential is suggested for kobuviruses based on their homology with human kobuviruses and the transmission of bovine kobuvirus to pigs (Ibrahim et al., 2022; Werid et al., 2022).
The global knowledge of RNA virus diversity is limited by the dependence on cell culture and reverse transcriptase polymerase chain reaction (RT-PCR) for virus discovery and by the priority given to viral pathogens of humans or of economic importance in animals (Chen et al., 2022). Fortunately, the metatranscriptome, which detects transcribed RNA, has proved to be a powerful tool for discovering the entire virome of humans, animals, and plants. At present, only a few studies have explored the enteric RNA virome of piglets and calves, and none are from India (Chen et al., 2018; Cortey et al., 2019; Smoľak et al., 2022). Thus, a wide knowledge gap exists on the diversity of animal RNA viruses in India. Since RNA viruses have the potential to cause severe disease outbreaks and are of zoonotic importance, unravelling their diversity is of immense importance for safeguarding human and animal health. The present study employed the metatranscriptomic approach to detect and recover the whole genomes of some previously uncharacterised viruses in diarrhoeic Indian piglets and calves.

Sample preparation and nucleic acid extraction
The faecal samples were randomly collected from 2017 to 2019 from diarrhoeic and apparently healthy piglets and calves under six months of age from Chandkhed (22) and Tathawade (17) in Western Maharashtra, India. All the samples were transported on ice packs and stored at -70°C. The samples were tested for RVA by VP6-based RT-PCR using previously published primers (Iturriza Gómara et al., 2002). The representative RVA-positive samples from piglets (NIV1740786 and NIV1740787) and calves (NIV198014 and NIV198016) were then enriched for viral particles by the NetoVIR (Novel Enrichment Technique of VIRomes) technique (Conceição-Neto et al., 2018). Faecal suspensions (10%) were prepared by vortexing 50 mg of faeces with 500 µl of sterile phosphate-buffered saline in a clean 1.5 ml microcentrifuge tube and were then centrifuged for 3 minutes at 17000 g. The supernatant was filtered through a 0.22 µm filter (Merck Millipore, Burlington, MA, United States). To remove free nucleic acids, the filtrate was incubated for 2 hours at 37°C with a combination of benzonase (Novagen, Madison, WI, United States), micrococcal nuclease (New England Biolabs, MA, United States), and 7 µl of homemade buffer (20 mM Tris, 100 mM CaCl2, and 30 mM MgCl2, pH = 8). The reaction was stopped by adding 7 µl of 10 nM EDTA. Nucleic acids were extracted with the QIAamp® Viral RNA Mini kit (Qiagen, Hilden, Germany) as directed by the manufacturer, without the use of carrier RNA.

Library preparation and sequencing
RNA sequencing libraries were prepared at Genotypic Technology Pvt. Ltd., Bangalore, India, using the NEBNext® Ultra™ II Directional RNA Library Prep Kit (New England BioLabs, MA, USA), which is compatible with the Illumina platform. For priming and fragmentation, 50 ng of total RNA was used. First-strand and second-strand cDNA synthesis were then performed on the fragmented and primed RNA. The double-stranded cDNA was purified using JetSeq Beads (Bioline, Cat. No. BIO-68031). Following end-repair, adenylation, and ligation of the purified cDNA to Illumina multiplex barcode adapters according to the NEBNext® Ultra™ II Directional RNA Library Prep methodology, second-strand excision was performed using the USER enzyme at 37°C for 15 minutes. The Illumina Universal Adapter (5'-AATGATACGGCGACCGAGATCTACACT TTCCCTACACGACGCTCTTCCGATCT-3') and Index Adapter (5'-GATCGGAAGAGCACACGTCTGAACTCCAGTCAC [INDEX] ATCTCGTATGCCGTTCTGCTTG-3') were used in the study. JetSeq Beads were used to purify the adapter-ligated cDNA, and 11 cycles of indexing (98°C for 30 sec) and cycling (98°C for 10 sec, 65°C for 75 sec, and 65°C for 5 min) were used to enrich the adapter-ligated fragments. Following a library quality control check, JetSeq Beads were used to purify the final PCR product (sequencing libraries). The Qubit fluorometer (Thermo Fisher Scientific, MA, USA) was used to quantify the Illumina-compatible sequencing libraries, and the Agilent 2200 TapeStation was used to examine the fragment size distribution. The libraries were sequenced on the Illumina HiSeq instrument (150 x 2 chemistry) (Supplementary File 1).

Sequence analysis
Sequencing was performed on the Illumina platform in paired-end format for all four samples. The raw data files were subjected to quality checks using the Linux-compatible FastQC v0.11.9 tool. The low-quality reads, adapters, and primers were removed with the Trimmomatic tool available on the Galaxy server (https://usegalaxy.org/). The cleaned reads were classified against the Kraken viral 2020 database using the Kraken tool (version 1.3.1) with default parameters on the Galaxy server. The Krona tools (version 2.7.1) on the Galaxy server were then used to generate interactive pie charts for the hierarchical classification and visualisation of the classified viruses. For further characterisation of each virus, all de-multiplexed reads of a taxon of interest were filtered, extracted, and de novo assembled with the Shovill Faster SPAdes assembler on the Galaxy server (version 1.1.0+galaxy1) with the assembly options for paired-end reads. For further validation and identification, the assembled contigs from each sample were annotated using standalone BLAST (NCBI-BLAST+ tool v2.9.0) with a custom viral genome sequence database.
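The read shares per virus reported from such a classification can be illustrated with a short sketch. This is not the authors' Galaxy workflow; it is a minimal Python example that assumes a Kraken-style tab-separated report (percentage, clade reads, directly assigned reads, rank code, taxonomy ID, indented name). The function name, the taxonomy IDs, the virus names, and the read counts in the example are all hypothetical.

```python
from typing import Dict


def taxon_read_shares(report_text: str, rank_code: str = "S") -> Dict[str, float]:
    """Return each taxon's share (%) of the reads assigned at one rank.

    Assumes six tab-separated columns per line: percentage, clade reads,
    directly assigned reads, rank code, taxonomy ID, indented name.
    """
    counts: Dict[str, int] = {}
    for line in report_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        clade_reads = int(fields[1])
        if fields[3].strip() == rank_code and clade_reads > 0:
            counts[fields[5].strip()] = clade_reads
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: 100.0 * n / total for name, n in counts.items()}


# Hypothetical report: names, taxonomy IDs, and counts are made up.
EXAMPLE_REPORT = "\n".join([
    "20.00\t200\t200\tU\t0\tunclassified",
    "80.00\t800\t0\tD\t10239\t  Viruses",
    "60.00\t600\t600\tS\t1\t    Hypothetical virus A",
    "20.00\t200\t200\tS\t2\t    Hypothetical virus B",
])

shares = taxon_read_shares(EXAMPLE_REPORT)
for name, pct in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {pct:.2f}% of species-level reads")
```

Note that the denominator here is the number of reads classified at the chosen rank, so unclassified reads do not dilute the shares; a different denominator (for example, all reads) would give different percentages.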

Phylogenetic analysis
The VP7 and VP4 (RVA), ORF1a, ORF1b, and ORF2 (AstV), and polyprotein (EVG, PSV, Aichivirus C, and PTV) nucleotide/amino acid sequences of the detected viruses were aligned with the same gene or region of prototype strains using the online MAFFT v7 program with default parameters. The phylogenetic trees were constructed by the neighbour-joining method using the Kimura 2-parameter model (nucleotide sequences) and the p-distance model (amino acid sequences) in MEGA 11.0 software (Tamura et al., 2021). The reliability of the phylogenetic groupings was assessed with 1000 bootstrap replications, and amino acid distances were calculated using the p-distance model.
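For readers unfamiliar with the two distance models named above, the following Python sketch shows how they are computed for one pair of aligned sequences. It is an illustration only, not MEGA's implementation: it assumes pairwise deletion of gapped or ambiguous sites, and the function names and toy sequences are hypothetical. The p-distance is the proportion of differing sites; the Kimura 2-parameter distance is d = -0.5 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def p_distance(seq1: str, seq2: str) -> float:
    """Proportion of differing sites between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    # Pairwise deletion: ignore columns where either sequence has a gap.
    pairs = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
             if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura 2-parameter distance between two aligned nucleotide sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    # Keep only columns where both bases are unambiguous.
    pairs = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
             if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    n = len(pairs)
    # Transitions: purine <-> purine or pyrimidine <-> pyrimidine changes.
    transitions = sum(
        1 for a, b in pairs
        if a != b and ({a, b} <= PURINES or {a, b} <= PYRIMIDINES)
    )
    transversions = sum(1 for a, b in pairs if a != b) - transitions
    p, q = transitions / n, transversions / n
    term1, term2 = 1 - 2 * p - q, 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1 * math.sqrt(term2))


# Toy example: one C->T transition in ten sites.
seq_a = "ACGTACGTAC"
seq_b = "ACGTACGTAT"
print(p_distance(seq_a, seq_b))    # 0.1
print(k2p_distance(seq_a, seq_b))  # slightly above 0.1, corrected for hidden changes
```

The amino acid identities quoted in the Results correspond to 100 x (1 - p-distance) for the aligned protein sequences.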

Recombination analysis
The nucleotide sequences of the complete genomes of the detected viruses were aligned with the datasets used in the phylogenetic analysis to detect recombinants. SimPlot v3.5.1 was then used to identify potential recombination sites (Lole et al., 1999). Bootscanning in the same programme was applied to identify the phylogenetically informative sites supporting alternative tree topologies (Salminen et al., 1995).
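The idea behind a similarity plot can be sketched in a few lines of Python. This is a simplified illustration, not SimPlot itself and not bootscanning (which builds bootstrapped trees per window): it slides a window along an alignment and records the percent identity between a query and each candidate parent, so that a change in the closest parent hints at a breakpoint. The function names, window and step sizes, and toy sequences are hypothetical; the study does not report the settings used.

```python
from typing import Dict, List, Tuple


def window_identity(query: str, reference: str,
                    window: int, step: int) -> List[Tuple[int, float]]:
    """Percent identity between two aligned sequences in sliding windows.

    Returns (window midpoint, percent identity) pairs.
    """
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    points = []
    for start in range(0, len(query) - window + 1, step):
        q = query[start:start + window]
        r = reference[start:start + window]
        # Ignore columns where either sequence has a gap.
        compared = [(a, b) for a, b in zip(q, r) if a != "-" and b != "-"]
        if compared:
            identity = 100.0 * sum(a == b for a, b in compared) / len(compared)
        else:
            identity = 0.0  # window made up entirely of gaps
        points.append((start + window // 2, identity))
    return points


def closest_reference_per_window(query: str, references: Dict[str, str],
                                 window: int, step: int) -> List[Tuple[int, str]]:
    """For each window, name the reference most similar to the query."""
    profiles = {name: window_identity(query, seq, window, step)
                for name, seq in references.items()}
    n_windows = len(next(iter(profiles.values())))
    result = []
    for i in range(n_windows):
        # Ties are resolved in favour of the first reference listed.
        best = max(profiles, key=lambda name: profiles[name][i][1])
        midpoint = profiles[best][i][0]
        result.append((midpoint, best))
    return result


# Toy alignment: the query matches parent_a in its first half and
# parent_b in its second half, mimicking a single breakpoint.
parents = {"parent_a": "A" * 40, "parent_b": "C" * 40}
query = "A" * 20 + "C" * 20
calls = closest_reference_per_window(query, parents, window=10, step=10)
print(calls)
```

In the toy example the closest parent switches between the second and third windows, which is where the simulated breakpoint lies; on real genomes such a switch would only be a lead to confirm with phylogenetic trees built on either side of the putative breakpoint.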

Results
Mixed RNA viruses were found in RVA-positive samples
Raw Illumina reads in the range of 2.2-4.9 million were generated for the four samples. The final processed reads, after adapter trimming and mapping against their respective host genomes, for the NIV1740786, NIV1740787, NIV198014, and NIV198016 samples were 235506, 270336, 460987, and 371965, respectively. The reads from each sample were taxonomically classified and visualised as pie charts using the Kraken and Krona tools. In the porcine samples NIV1740786 and NIV1740787, reads of PAstV, porcine RVA, porcine enterovirus G (data unpublished), PSV, Aichivirus C, and PTV made up 48.59%, 14%, 10%, 8%, 7%, and 0.8%, and 58.27%, 35%, 2%, 2%, 1%, and 0.08%, respectively, of all aligned reads (Figures 1A, B). In the bovine samples NIV198014 and NIV198016, sequences belonging to BAstV and bovine RVA made up 97.02% and 0.002%, and 96.38% and 0.001%, respectively, of all aligned reads (Figures 1C, D). Upon BLASTn analysis, the final next-generation sequencing (NGS) reads showed homologies to several porcine and bovine viruses (Table 1; Supplementary File 2). A large number of reads were labelled as uncategorised. Further analysis of these reads may reveal previously unidentified viruses. The sequence data generated in the present study were deposited in the NCBI Sequence Read Archive, https://www.ncbi.nlm.nih.gov/sra (Accession number: PRJNA941130).

Picornavirus
Eight genomes belonging to the family Picornaviridae were assembled. Of these, two identical complete ORFs encoding a single polyprotein of 2323 AA were obtained for PSV (contig00002-NIV1740786 & NIV1740787). The detected PSV strain had a maximum AA identity of 97.5% to the Japanese HkKa2-3 and HkKa2-2 strains (Figure 3), while it was more distantly related to the only two available full genomes of Indian strains, WB_76_tc (94.05%) and SPFC-6 (93.31%). Although neither study strain showed the insertions or deletions at the 3' end of VP1 seen in multiple sapelovirus strains, a mutation from PAT to TAE at AA positions 898-900 was detected in comparison to the prototype strain (UK/V13).
One complete (contig00001-NIV1740786) and one partial (contig00006-NIV1740787) sequence of the ORF coding for the polyprotein, of 2459 AA and 1537 AA, respectively, were retrieved for Aichivirus C. At the amino acid level, both contigs were 99.7% identical. The amino acid identity between contig00001-NIV1740786 and the complete polyprotein gene sequences available in the NCBI database varied from 99.93 to 99.97%, with a maximum of 99.0% for the American OH/RV50/2011 and Japanese Ishi-Ta4 strains and the lowest identity for the swine/HBYT/2018/China strain (Figure 4). The partial sequence contig00006-NIV1740787 was 96.7% identical to the above strains.
The complete polyprotein sequence of 2238 AA was obtained for one PTV strain (contig00003-NIV1740786). It exhibited 94.7-95% AA identity to the Chinese SA9 (PTV-17) and SWU-M strains (Figure 5). The VP1 gene of the study strain, when aligned with the prototype and proposed genotype sequences, showed 90.1-90.8% AA identity with the PTV-17 Chinese SA9 and PTV-China/SWU-M/2020 strains.

Recombination analysis
Potential recombination events were identified in PAstV4 and PAstV2. The PAstV4 strains (contig00004-NIV1740786 & NIV1740787) were recombinants arising from strains PoAstV4/35/USA/2010 and PoAstV4/CH/JXZS/2014 (Figure 7A). The bootscan plot showed a breakpoint at nucleotide (nt) position 4155. The phylogenetic trees showed that before the breakpoint, the PAstV4 study strains clustered with the 35/USA strain, whereas after the breakpoint they clustered with the JXZS strain (Figures 7B, C). The PAstV2 (contig00005-NIV1740787) strain was a recombinant of Xinjiang mamastrovirus 7/227-67505, PoAstV2/Bel-12R021, and PoAstV2/Sichuan mamastrovirus 7/isolate s68-555025 (Figure 8A). Three breakpoints were visible at nt positions 2795, 4571, and 5373. To confirm the results, phylogenetic trees were constructed using all the fragments of the recombinant virus (Figures 8B, D). The PAstV2 study strain grouped with the 227-67505 strain before position 2795, the Bel-12R021 strain between positions 2795 and 4571, and the s68-555025 strain between positions 4571 and 5373. The above observations indicate that the PAstV4 strain (contig00004-NIV1740786) and the PAstV2 strain (contig00005-NIV1740787) might be potential recombinants. No evidence of recombination was detected in the other viruses.

[Figure caption] The phylogenetic tree based on nucleotide sequences of the VP7 and VP4 genes of the Indian porcine G4P[6] RVA strains: The neighbour-joining tree was established using the Kimura 2-parameter model with 1000 bootstrap replicates in MEGA 11.0 software. The bar represents the genetic distance, while numbers indicate the bootstrap replicates (>75% are shown at branch points). The G4 and P[6] lineages were assigned as per previous reports (Wandera et al., 2021). The detected strains are marked with a black circle.

[Figure caption] The phylogenetic tree based on amino acid sequences of the sapelovirus polyprotein: The neighbour-joining tree was established using the pairwise distance model with 1000 bootstrap replicates in MEGA 11.0 software. The bar represents the genetic distance, while numbers indicate the bootstrap replicates (>75% are shown at branch points). The detected strains are marked with a black circle.

Discussion
Diarrhoea has multiple aetiologies and causes huge economic losses to cattle and pig farmers worldwide. Traditionally, infectious causes of diarrhoea were diagnosed as single aetiologies. However, next-generation sequencing has shown that many microbes are present in the faeces of animals. The present metatranscriptome study of RVA-positive diarrhoeic faecal samples revealed the presence of diverse RNA viruses, including picornaviruses and AstV, in piglets, while only AstV was found in bovine samples.
The G4P[6] genotype combination is common in piglets but considered unusual in humans (Papp et al., 2013b; Zhou et al., 2015). Although the G4P[6] study strains resembled the prototype Gottfried genotype constellation, the phylogenetic analysis revealed that the VP7 gene was more similar to those of human strains in G4 lineage-1, indicating RVA genome reassortment during coinfection and interspecies transmission. However, such anthropozoonotic and zooanthroponotic transmission of G4P[6] strains, with or without reassortment, might have been occurring for quite a long time in India (Mukherjee et al., 2011; Giri et al., 2019). The VP6 gene was considered to be of porcine origin, although it shared the highest nucleotide identity (94.90-97.57%) with Indian human (RV1020, CMC_00038, NIV929893) strains, because: i) the VP6 gene also shared high nucleotide identity (96.15-96.73%) with Indian porcine (UP-Por30, UP-Por34) strains, and ii) the VP6 gene shared a low nucleotide identity (90.20-90.79%) with the Wa strain (G1P[8]). These outcomes support the previous observations about the potential porcine origin of the I1 VP6 gene in sporadic human strains and that it is more common in pigs (Papp et al., 2013b; Zhou et al., 2015). Low nucleotide and amino acid identities were observed for the VP4 and NSP3 genes. This reflects the lack of known close relatives of the study strain because of the limited surveillance and full genome characterisation of porcine RVAs. The continued infections of humans with G4P[6] strains necessitate surveillance in both humans and pigs to understand human-to-human transmission and its impact on the effectiveness of introduced vaccines.

[Figure caption] The phylogenetic tree based on amino acid sequences of the porcine aichivirus C polyprotein: The neighbour-joining tree was established using the pairwise distance model with 1000 bootstrap replicates in MEGA 11.0 software. The bar represents the genetic distance, while numbers indicate the bootstrap replicates (>75% are shown at branch points). The detected strains are marked with a black circle.

[Figure caption] The phylogenetic tree based on amino acid sequences of the porcine teschovirus polyprotein: The neighbour-joining tree was established using the pairwise distance model with 1000 bootstrap replicates in MEGA 11.0 software. The bar represents the genetic distance, while numbers indicate the bootstrap replicates (>75% are shown at branch points). The detected strains are marked with a black circle.
In addition to RVA, the study detected different RNA viruses such as PAstV, enterovirus G, PSV, Aichivirus C, and PTV in pigs with diarrhoea, but mostly as single strains, except for AstV. The detection of multiple viruses in diarrhoeic piglets in the present study indicates that infection by one virus might predispose to infection by other viruses. This may be because rotaviruses have the capacity to sustain the initial infection, are more adept at replicating in the host, and cause interference that leads to infection by other members of the virome (Doerksen et al., 2022).
Most of the genera detected in the present study (4 out of 6) belong to the family Picornaviridae. The detection of diverse picornaviruses in pig stools is not unusual, as such coinfections have already been reported (Shan et al., 2011; Chen et al., 2018). There are multiple reasons why picornaviruses are detected more often than other enteric viruses in clinical samples. Among them, their excretion in large numbers, their resistance in the environment, which increases survivability, and the longer exposure periods for pigs increase their chances of detection. On the one hand, these features make them easier to detect, but on the other, they bias their detection in mixed infections. The present study not only detected mixed infections but also recovered the full genome sequences of PAstV, EVG, PSV, Aichivirus C, and PTV and characterised them genetically. The clinical significance of the detected viruses has not been fully elucidated, as they are detected in both healthy and diseased animals. However, the complete genome sequences of the above viruses will facilitate the development of diagnostics and the design of future epidemiological studies.
Among the porcine AstVs (PAstV1-5), PAstV4 is reported to have the highest prevalence in domestic pigs from the USA, Europe, and Asia (Folgueiras-González et al., 2021). In the Indian pig population, all five PAstV lineages are in circulation, with a predominance of lineage 4, followed by lineage 2 (Kattoor et al., 2019; Kour et al., 2021). Earlier Indian studies amplified partial ORF1a and/or ORF2 genome regions; as a result, only partial PAstV sequences are available. However, the present study, employing a metatranscriptomic approach for the first time, contributed to the recovery of two whole genome sequences from Indian pigs. Overall, the present study detected five strains of the PAstV4 lineage and two strains of the PAstV2 lineage, representing a diversity of viruses in Indian pigs; however, the study did not aim to investigate the prevalence of the virus. In the phylogenetic tree, the PAstV4 study strains were part of the PAstV4 L4 lineage, and the PAstV2 strains were part of the PAstV2 L1 lineage, consisting of Japanese and Chinese strains. However, the origins of the AstV strains could not be elucidated, as most of the available sequences in GenBank are from the USA, Japan, and China. RNA recombination is a driving force in the evolution, emergence, and virulence of AstVs (Zhu et al., 2021). The parents of the PAstV4 recombinant strain were identified from the USA and China (Jiangxi), whereas the three parental strains of the PAstV2 recombinant strain were from Belgium and two provinces of China (Xinjiang and Sichuan). The recombination in AstV suggests that the pig trade and faecal-oral transmission promoted recombination between strains from geographically different locations.

[Figure caption] The phylogenetic tree based on amino acid sequences of porcine and bovine astrovirus ORF1a, ORF1b, and ORF2 proteins: The neighbour-joining tree was established using the pairwise distance model with 1000 bootstrap replicates in MEGA 11.0 software. The bar represents the genetic distance, while numbers indicate the bootstrap replicates (>75% are shown at branch points). The lineages of PAstV and groups of BAstV were assigned according to previous reports (Ito et al., 2017; Zhu et al., 2021). The detected strains are marked with a black circle.
Neonatal calf diarrhoea is most commonly caused by RVA, and its co-infection with BAstV has been reported in Korea (Oem and An, 2014), China (Alfred et al., 2015; Zhu et al., 2021), and Italy (Martella et al., 2020). Although previous studies could not establish an association of AstV with diarrhoea, a recent metagenomics study found most of the BAstV reads in diarrhoeic rather than healthy Chinese calves (Lu et al., 2021). Another study on Chinese calves established a positive correlation between diarrhoea and the presence of BAstV, alone or in co-infection (Zhu et al., 2022). In concurrence, the present study recovered BAstV for the first time, suggesting that the major viral pathogen associated with calf diarrhoea might be AstV. However, the lack of complete genome sequences makes it difficult to explore the precise origin of the virus and the dynamics of interspecies and zoonotic transmission. In addition to AstV, RVA reads were detected, but contigs could not be assembled; this may be because of the low virus load in the samples, which was nevertheless picked up by RT-PCR, which is more sensitive than NGS.
The detected viruses were validated by virus isolation for porcine RVA until the fourth passage in MA104 cells, confirmed by sandwich ELISA and real-time quantitative RT-PCR (unpublished data, Supplementary File 2). Moreover, bovine RVA (Sawant et al., 2020a), astrovirus (Sawant et al., 2023), porcine enterovirus G, and porcine teschoviruses (Sawant et al., 2020b) were detected by RT-PCR using specific primers.
Nevertheless, the present exploratory study has significant limitations. First, only a few samples (two each from bovine and porcine species, from the same farms) were analysed, resulting in the retrieval of identical viruses and thus restricting the exploration of diversity. Larger and broader sampling will likely demonstrate the temporal dynamics of microbial species. Second, healthy animals were not included to provide a baseline virome, which would have helped to delineate the changes in the RNA virome after RVA infection. Third, the study focused only on RVA-infected diarrhoeic animals; hence, the role of the detected viruses in disease could not be delineated. Lastly, validation by isolation of all the viruses was not possible because enteric viruses are difficult to isolate in cell lines, the amount of sample was insufficient, and cell lines supporting the growth of many of the detected viruses were not easily available. Even so, the exploration of the virome within a sample signifies the advent of a new age in diagnosis, which can discern unknown causes of diarrhoea and provide strong technical support for preventing and controlling diarrhoea. In summary, the study revealed a diverse RNA virome in pigs and a predominant astrovirus infection in calves with diarrhoea. The results emphasise that RVA-infected pigs are co-infected with diverse RNA viruses, favouring the emergence of new reassortant and recombinant viruses such as RVA with the G4P[6] genotype and PAstV, respectively. In the future, such studies with large sample sizes covering different health conditions will play a seminal role in the surveillance of possible zoonotic pathogens at the human-animal interface.

TABLE 1
The diversity of sequences identified, their per cent identity, and coverage in porcine and bovine faecal samples.

TABLE 3
The heatmap showing comparative amino acid identities (%) of detected PAstV and BAstV with reference AstVs. Indicates the sequence is partial. The maximum and minimum identity of gene are colour coded as green and red, respectively.