Robust Virome Profiling and Whole Genome Reconstruction of Viruses and Viroids Enabled by Use of Available mRNA and sRNA-Seq Datasets in Grapevine (Vitis vinifera L.)

Next-generation sequencing (NGS) based virome analyses of mRNA and sRNA have recently become a routine approach for reliable detection of plant viruses and viroids. In the present study, we identified the viral/viroidal spectrum of three Indian grapevine cultivars and reconstructed their whole genomes using the publicly available mRNAome and sRNAome datasets. Twenty-three viruses and viroids (including two variants of grapevine leafroll associated virus 4) were identified from two tissues (fruit peels and young leaves) of three cultivars, among which nine unique grapevine viruses and viroids were identified for the first time in India. Irrespective of the assemblers and tissues used, the mRNA based approach identified more acellular pathogens than the sRNA based approach across cultivars. Further, the mRNAome was on par with the whole transcriptome in viral identification. Through de novo assembly of transcriptomes followed by mapping against reference genomes, we reconstructed 19 complete/near complete genomes of identified viruses and viroids. The reconstructed viral genomes included four larger RNA genomes (>13 kb), a DNA genome (RG grapevine geminivirus A), a divergent genome (RG grapevine virus B) and a genome for which no reference is available (RG grapevine virus L). The large number of SNPs detected in this study supported the quasispecies nature of viruses. Detection of three recombination events and phylogenetic analyses using reconstructed genomes suggested the possible introduction of viruses and viroids into India from several continents through the planting material. The whole genome sequences generated in this study can serve as a resource for reliable indexing of grapevine viruses and viroids in quarantine stations and certification programs.


INTRODUCTION
Grapevine (Vitis vinifera L.) is an important cash crop grown worldwide (McGovern, 2003). Being a clonally propagated crop, grapevine is prone to coinfection by different viruses and viroids (Jo et al., 2018). It is reported to be susceptible to the largest number of acellular pathogens compared to other crop species (Beuve et al., 2018; Hily et al., 2018). To date, more than 70 viruses and 7 viroids have been reported to infect grapevine (Singhal et al., 2019). Grapevine viruses often deviate from the classical 'one pathogen-one disease' concept, i.e., interaction among more than one viral agent leads to disease development (Byrd and Segre, 2016).
Planting pathogen free crop propagules is of paramount importance in grapevine for increasing the productive life of vineyards (Kumar et al., 2015). Traditional detection techniques like ELISA, PCR and their variants are employed for indexing of a few selected viruses of grapevine while certifying the planting material for commercial planting. But these methods can only answer whether the pathogen(s) under investigation is present or not, leaving the status of all other untested viruses in the planting material unknown (Czotter et al., 2018). To assure the health of grapevine planting material, which remains productive in the field for an average of 15 years in India, it would be necessary to subject it to rigorous indexing for all possible grapevine infecting viruses/viroids. For this, it would be essential to study the virome (total viral population) of the mother stock, the results of which can then be used for developing appropriate detection assays of all pathogens for screening the clonal propagules. Next-generation sequencing (NGS) approaches can provide a snapshot of the virome present in the propagule, as they are effective not only in detecting the known viral pathogens and their variants, if any, but also in unravelling unknown one(s) (Jo et al., 2018). Among the various NGS based approaches, sRNA (sRNAome) and mRNA (mRNAome) sequencing are commonly used to reveal the virome of a given sample (Pantaleo et al., 2010; Pirovano et al., 2015; Jones et al., 2017; Maliogka et al., 2018; Pooggin, 2018; Massart et al., 2019). Recently, a few studies attempted to reveal the virome of different crops like grapevine, apple, and pepper from publicly available mRNAome data (Jo et al., 2015, 2016, 2017). Both sRNA and mRNA pools can effectively capture single as well as double strand RNA viruses and some DNA viruses (Seguin et al., 2014; Roossinck et al., 2015; Jo et al., 2017).
However, the relatively lower representation of viral RNA in the background of total plant RNA limits the use of mRNAome compared to sRNAome for viral detection (Beuve et al., 2018; Maliogka et al., 2018). As mRNA based methods can give longer contigs, they are more useful for variant detection, especially when significant genetic diversity exists, as found in some grapevine viruses such as grapevine leafroll associated virus 3 (GLRaV3) (Xiao et al., 2019). Thus, it would be worthwhile to study the virome using both these methods for robust identification of the entire virome of a plant species.
Though India grows grapevine on 137,000 hectares and exports 185,172 tonnes of grapes annually (FAOSTAT, 2017), only a few studies have attempted to detect grapevine viruses in India. All these studies targeted only one or a few virus(es)/viroid(s) at a time using traditional detection methods (Kumar et al., 2012, 2013; Sahana et al., 2013; Adkar-Purushothama et al., 2014; Rai et al., 2017; Marwal et al., 2019; Singhal et al., 2019). The current study is the first virome report of grapevines from India; it uses sRNA and mRNA datasets of three Indian grapevine cultivars available in the public domain (Tirumalai et al., 2019) and identifies a large number of viruses and viroids.

Plant Materials and Library Construction
Detailed information on plant materials and library construction is available in Tirumalai et al. (2019). In brief, total RNA was isolated from fruit peels (FP) and young leaves (YL) of three grapevine cultivars-Bangalore Blue (BB), Dilkush (DK), and Red Globe (RG). mRNA-seq and sRNA-seq libraries with two biological replicates, 24 in total, were constructed from isolated total RNA according to the NEXT flex Rapid directional mRNAseq bundle library protocol (Trapnell et al., 2012) and the TruSeq Small RNA Sample Preparation Guide (Illumina, San Diego, CA, United States), respectively. Sequencing was performed on the Illumina NextSeq500 platform, which yielded 75 bp single end reads. Thus, a total of 12 mRNA and 12 sRNA libraries obtained from two tissues (FP, YL) of three grapevine cultivars (BB, DK, RG) in two biological replicates were analyzed in the current study. The details of the materials used and the complete processing pipeline are indicated in Figure 1.

Raw Data Pre-processing and de novo Assembly of Pre-processed Reads
The bioinformatics analyses were performed using Advanced Super Computing Hub for Omics Knowledge in Agriculture (ASHOKA) facility at ICAR-IASRI, New Delhi, India. Raw data of 24 libraries were downloaded from SRA database and converted to FASTQ files using the SRA toolkit version 2.9.6 (Leinonen et al., 2010). Three approaches were followed for de novo assembly. In the first approach, mRNA and sRNA libraries, 12 each, were individually assembled using Trinity version 2.5.1 (Grabherr et al., 2011) and CLC genomics workbench 12 de novo assembly tool, respectively with default parameters. For the second approach, combined mRNAome or sRNAome for each cultivar was obtained by aggregating corresponding mRNA or sRNA reads, respectively from four libraries (including two tissues and two replicates) of each individual cultivar. Similarly, whole transcriptome of each cultivar was obtained by aggregating both mRNA and sRNA reads from eight libraries of individual cultivar in the third approach. Trinity (k = 25), SPAdes (k = 21,23,25) version 3.13.1 (Bushmanova et al., 2019) and CLC (automatic word size = 20), Velvet (k = 17) version 1.2.10 (Zerbino and Birney, 2008) were used to assemble combined mRNAomes and sRNAomes, respectively while whole transcriptomes were assembled using SPAdes (k = 17,19,21) and Velvet (k = 21) assemblers.
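The three assembly inputs described above (individual libraries, per-cultivar combined mRNAome/sRNAome, and per-cultivar whole transcriptome) can be illustrated with a short Python sketch. It only enumerates and groups library labels; the label pattern (cultivar, tissue, replicate, as in RGFPR2) follows the naming used in this study, while the function names and the "pool:label" notation are hypothetical and introduced purely for illustration.

```python
from itertools import product

CULTIVARS = ["BB", "DK", "RG"]   # Bangalore Blue, Dilkush, Red Globe
TISSUES = ["FP", "YL"]           # fruit peel, young leaf
REPLICATES = ["R1", "R2"]        # two biological replicates
POOLS = ["mRNA", "sRNA"]         # nucleic acid pools that were sequenced


def library_ids():
    """Return (pool, cultivar, tissue, replicate) tuples for all libraries."""
    return list(product(POOLS, CULTIVARS, TISSUES, REPLICATES))


def combined_sets(libs):
    """Group libraries into the inputs of the second and third approaches.

    combined: (pool, cultivar) -> labels of the four libraries that are pooled
    whole:    cultivar -> labels of all eight libraries (mRNA plus sRNA)
    """
    combined = {}
    whole = {}
    for pool, cultivar, tissue, replicate in libs:
        label = f"{cultivar}{tissue}{replicate}"   # e.g. RGFPR2
        combined.setdefault((pool, cultivar), []).append(label)
        whole.setdefault(cultivar, []).append(f"{pool}:{label}")
    return combined, whole


libs = library_ids()
combined, whole = combined_sets(libs)
print(len(libs), combined[("mRNA", "RG")], len(whole["RG"]))
```

Each combined set holds four libraries and each whole transcriptome eight, matching the counts stated in the text.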

Identification of Viruses and Viroids and Copy Number Estimation
All the assembled contigs were subjected to standalone MEGABLAST analysis (e-value cut off: 1e-5; query coverage: ≥80%) against the complete reference sequences of viruses and viroids (http://www.ncbi.nlm.nih.gov/genome/viruses/) using NCBI blast+ version 2.9.0. Only contigs of greater than 50 (for sRNAome) and 200 nucleotides (mRNAome and whole transcriptome) were considered for analyses. To validate the viruses/viroids identified through assembly, the reads of each mRNA/sRNA library were first mapped to the Vitis vinifera genome (GCF_000003745.3) using CLC workbench mapping tool with default parameters (match score-1, mismatch cost-2, length fraction-0.5, similarity fraction-0.8). The unmapped reads were then analyzed using MEGABLAST algorithm (e-value cut off: 1e-5; query coverage: ≥80%) against the reference genomes of viruses and viroids. Only those viruses/viroids that were detected through assembly (from sRNAome/mRNAome/whole transcriptome) and BLAST analysis of reads from at least two libraries (derived from the particular nucleic acid pool from which the contigs were obtained) of the corresponding cultivar were considered. To arrive at the copy number for a virus/viroid, the number of reads associated with either the RdRp ORF [in case of viruses that use sub-genomic RNA (sgRNA) for translation] or the entire polyprotein [in grapevine fleck virus (GFkV) and grapevine rupestris vein feathering virus (GRVFV)] or the entire genome (in viroids) was multiplied by 75 (for mRNA) or 24 (for sRNA) and then divided by the size (bp) of the corresponding genomic region of the virus/viroid. Intact mRNA reads were used for copy number estimation while the pre-processed reads were used in case of sRNA. The average length of pre-processed sRNA reads in all libraries was close to 24. Hence the factors 75 and 24 were used for mRNA and sRNA libraries, respectively.
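The copy number estimate described above reduces to a single ratio: read count times read length, divided by the length of the genomic region considered. The following Python sketch restates that arithmetic; the function name and the example read counts and region lengths are hypothetical and do not correspond to any library in this study.

```python
MRNA_READ_LENGTH = 75   # bp, intact mRNA reads
SRNA_READ_LENGTH = 24   # nt, approximate mean length of pre-processed sRNA reads


def copy_number(read_count, region_length_bp, read_length):
    """Estimate copy number as (reads x read length) / region length.

    region_length_bp is the size of the RdRp ORF, the polyprotein or the
    whole genome, depending on the virus/viroid as described in the text.
    """
    if region_length_bp <= 0:
        raise ValueError("region length must be positive")
    return read_count * read_length / region_length_bp


# Hypothetical inputs, for illustration only.
print(copy_number(1000, 3000, MRNA_READ_LENGTH))   # 25.0
print(copy_number(500, 300, SRNA_READ_LENGTH))     # 40.0
```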
As a reference genome for grapevine virus L (GVL) was not available in NCBI, we included the de novo assembled GVL genome of the present study (that was identified by performing BLASTn analysis of larger contigs against "non-redundant" (nr) (NCBI) database) for MEGABLAST analysis.
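The acceptance criteria of this section (e-value and query coverage cut-offs, minimum contig length, and read support from at least two libraries) can be sketched in Python as below. This is a minimal illustration, not the pipeline used in the study: the record layout (dictionaries with the keys virus, evalue, qcov and qlen), the function names, and the reading of the e-value cut-off as "at most 1e-5" are assumptions.

```python
# Minimum contig length (nt) per assembly source; contigs must be longer.
MIN_CONTIG_LEN = {"sRNAome": 50, "mRNAome": 200, "whole": 200}


def passes_filter(hit, source):
    """Apply the e-value, query coverage and contig length criteria."""
    return (
        hit["evalue"] <= 1e-5            # assumed: cut-off is inclusive
        and hit["qcov"] >= 80.0          # query coverage in percent
        and hit["qlen"] > MIN_CONTIG_LEN[source]
    )


def confirmed_viruses(contig_hits, read_hits_by_library, source, min_libraries=2):
    """Return viruses/viroids found by assembly and supported by reads
    from at least `min_libraries` libraries of the same cultivar and pool."""
    from_contigs = {h["virus"] for h in contig_hits if passes_filter(h, source)}
    support = {}
    for viruses in read_hits_by_library.values():
        for virus in set(viruses):       # count each library once per virus
            support[virus] = support.get(virus, 0) + 1
    return {v for v in from_contigs if support.get(v, 0) >= min_libraries}


# Hypothetical records, for illustration only.
hits = [
    {"virus": "HSVd", "evalue": 1e-30, "qcov": 95.0, "qlen": 310},
    {"virus": "GVA", "evalue": 1e-3, "qcov": 99.0, "qlen": 900},
    {"virus": "GFkV", "evalue": 1e-20, "qcov": 90.0, "qlen": 150},
]
reads = {"lib1": ["HSVd", "GVA"], "lib2": ["HSVd"], "lib3": ["GFkV", "GFkV"]}
print(confirmed_viruses(hits, reads, "mRNAome"))
```

In this toy input only HSVd satisfies every criterion: GVA fails the e-value cut-off, and GFkV is too short for an mRNAome contig and is supported by reads from a single library.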

Reconstruction of Whole Genomes of Viruses and Viroids
Virus/viroid associated contigs were filtered from the total contigs using SAM tools version 1.9 (Li et al., 2009). The detailed procedure followed for genome reconstruction is given in Supplementary Figure S1. In brief, the Trinity assembled longer contigs from combined mRNAomes were examined for the presence of intact viral/viroidal genome. Further, the SPAdes assembled longer contigs from combined mRNAomes and whole transcriptomes were examined followed by inspection of Trinity assembled larger contigs in individual mRNA libraries. Next, the Trinity assembled viral/viroidal contigs from combined mRNAomes were mapped against the NCBI designated reference genomes of identified viruses and viroids (CLC workbench mapping tool). In cases where the Trinity assembled contigs were insufficient to reconstruct the entire genome, SPAdes assembled contigs from combined mRNAomes and whole transcriptomes were added during mapping. If the genome still could not be obtained, the most closely related genome was used as reference during mapping. The full length consensus sequence, if obtained, after mapping/directly by de novo assembly was considered as the complete/near complete genome for a particular virus/viroid. To find ORFs in assembled viral genomes, we used NCBI ORF finder.
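One way to quantify how much of a reference genome is spanned by mapped contigs is to merge the mapped intervals and divide the covered length by the reference length. The Python sketch below illustrates this idea only; it is not the CLC workbench procedure, the 1-based inclusive interval convention and the function name are assumptions, and the example coordinates are hypothetical.

```python
def covered_fraction(intervals, ref_len):
    """Fraction of a reference covered by mapped contigs.

    intervals: (start, end) positions on the reference, 1-based inclusive.
    Overlapping or nested intervals are counted only once.
    """
    covered = 0
    last_end = 0                         # last reference position already counted
    for start, end in sorted(intervals):
        start = max(start, last_end + 1) # skip positions counted earlier
        end = min(end, ref_len)          # ignore overhang past the reference
        if end >= start:
            covered += end - start + 1
            last_end = end
    return covered / ref_len


# Hypothetical contig placements on a 300 nt reference.
fraction = covered_fraction([(1, 100), (50, 150), (200, 300)], 300)
print(round(fraction, 4))
```

Here positions 1-150 and 200-300 are covered (251 of 300 nt). Under such a measure, a value above 0.99 would correspond to the near complete category used in this study.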

Pairwise Distance and Phylogenetic Analyses
The complete genomes retrieved from NCBI along with the viral/viroid genomes reconstructed in this study were aligned using CLUSTALW tool in MEGA7 software version 7.0.26 (Kumar et al., 2016). Aligned sequences were subjected to pairwise distance analysis and phylogenetic tree construction using the neighbor-joining (NJ) method and the Kimura 2-parameter (K2P) model with 1000 bootstrap replicates. For grapevine geminivirus A (GGVA), grapevine latent viroid (GLVd), grapevine leafroll associated virus 4 (GLRaV4), grapevine virus B (GVB), GVL, grapevine rootstock stem lesion associated virus (GRSLaV) and GRVFV, all the respective complete genomes available in NCBI were used for analysis. Owing to the availability of a large number of genome sequences for GLRaV3, only those sequences showing 100% query coverage in BLASTn analysis against nr (NCBI) database were taken for analysis. Similarly, in cases of Australian grapevine viroid (AGVd), grapevine yellow speckle viroid-1, -2 (GYSVd1, GYSVd2), and hop stunt viroid (HSVd), only 10 non-redundant genomes that were highly similar to each isolate of a viroid were used. In all the cases, an outgroup (except for pairwise distance analysis) and the NCBI designated reference genome (except GVL, for which there is no designated reference sequence) were included.
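For readers unfamiliar with the K2P model, the distance between two aligned sequences is d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites showing transitions and transversions, respectively. The Python sketch below computes this for one pair of aligned sequences. It is an illustration and not the MEGA7 implementation; in particular, skipping sites with gaps or ambiguous bases in either sequence is an assumption about gap handling.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned nucleotide sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to equal length")
    s1 = seq1.upper().replace("U", "T")
    s2 = seq2.upper().replace("U", "T")
    sites = transitions = transversions = 0
    for a, b in zip(s1, s2):
        if a not in "ACGT" or b not in "ACGT":
            continue                     # assumed: ignore gaps/ambiguous bases
        sites += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1             # A<->G or C<->T
        else:
            transversions += 1           # purine <-> pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    # math.log raises ValueError when the sequences are too divergent
    # for the model (argument not positive).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# One transition in ten sites: P = 0.1, Q = 0.
print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))
```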

Single Nucleotide Polymorphism (SNP) Analyses
The host unmapped reads of individual cultivars were mapped against the complete/near complete viral/viroid genomes assembled from the corresponding cultivar using the mapping tool available in CLC workbench using default parameters (match score-1, mismatch cost-2, length fraction-0.5, similarity fraction-0.8). For SNP detection, the mapped files were subjected

Recombination Analyses
Using the CLUSTALW aligned MEGA file as input, recombination analysis was performed using RDP4 package version 4.39, employing nine different algorithms. Only recombination events detected by at least five algorithms in the reconstructed viral genomes were considered. Only the viral sequences used for phylogenetic analyses were used for detection of recombinants.
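The acceptance rule for recombination events (support from at least five of the nine algorithms) can be expressed as a simple filter. The Python sketch below assumes a hypothetical record layout in which each candidate event lists the methods that detected it; the method labels m1 to m9 are placeholders and the event names are invented for illustration. It does not reproduce RDP4 output.

```python
def supported_events(events, min_methods=5):
    """Keep candidate events detected by at least `min_methods` distinct methods."""
    return [e for e in events if len(set(e["methods"])) >= min_methods]


# Hypothetical candidate events; m1..m9 stand in for the nine algorithms.
candidates = [
    {"event": "event_1", "methods": ["m1", "m2", "m3", "m4", "m5", "m6"]},
    {"event": "event_2", "methods": ["m1", "m2", "m3"]},
    {"event": "event_3", "methods": ["m1", "m1", "m2", "m3", "m4"]},
]
print([e["event"] for e in supported_events(candidates)])
```

Duplicate method labels are counted once, so the third hypothetical event has only four distinct supporting methods and is rejected.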

Pre-processing of Raw Data
The number of raw reads ranged from 10.5 to 40.2 million with an average of 23.3 million for mRNA and 2.9 to 8.3 million with an average of 4.3 million for sRNA libraries (Table 1). As mRNA reads were of acceptable quality (without adapter sequences; phred-score > 20), we proceeded directly for de novo assembly while sRNA reads were filtered to remove adapter sequences and poor quality reads (quality scores < 0.05).

Identification of Viruses and Viroids From Grapevine mRNAome and sRNAome
We identified more viruses and viroids from mRNAome (23) than sRNAome (7) across cultivars and tissues (Supplementary Table S1). The only exception to this was the FP-specific sRNA datasets of cv. DK, which identified six viruses and viroids while the corresponding mRNAome could identify only five. Among the two tissues, relatively more viruses/viroids were identified in FP than YL in all cultivars except DK from mRNA libraries. However, a nearly similar number of viruses/viroids was identified from sRNA libraries across tissues and cultivars (Supplementary Figure S2 and Supplementary Table S1). Combined mRNAome assembly using Trinity identified the same number of viruses and viroids (23) across cultivars as compared to the individual mRNA libraries (23). However, on a closer look, we found that combining the reads of the two tissues of each cultivar did offer some advantage in case of mRNAome, since additional virus(es)/viroid(s) were identified in BB (1), DK (2) and RG (1). The only exception to this is GVE, which was detected in individual mRNA libraries but not in combined mRNAome (Figures 2A,B, Supplementary Figures S3A,B, and Supplementary Table S2). Similarly, combined sRNAome assembly using CLC was more effective as it could identify two unique viruses (GVF in DK and GRSLaV in RG) in addition to the seven viruses and viroids identified by the individual library approach across cultivars (Figures 2D,E, Supplementary Figures S4A,B, and Supplementary Table S3). Between the combined sRNAome and combined mRNAome, the former could identify only a fraction of viruses and viroids (12) identified by the latter even after accounting for the viruses and viroids identified by all the assemblers. Interestingly, from combined mRNAomes and whole transcriptomes exactly the same number of viruses/viroids was identified in BB, DK and RG cultivars (6, 10, and 21), representing a total of 23 viruses/viroids, though the identities of a few differed in cvs. DK and RG.
The identified acellular pathogens included 14 grapevine viruses (including two GLRaV4 variants), four mycoviruses and five viroids -Alternaria alternata chrysovirus 1 (AaCV1), Alternaria arborescens mitovirus 1 (AaMV1), AGVd, Erysiphe necator mitovirus 1 (EnMV1), Erysiphe necator mitovirus 3 (EnMV3), GFkV, GGVA, GLVd, GLRaV3, grapevine leafroll associated virus -4, -5, -6 (GLRaV4, GLRaV5, GLRaV6), GVA, GVB, GVL, GVE, GVF, GRSLaV, GRVFV, GYSVd1, GYSVd2, HSVd and tobacco streak virus (TSV  Figures S4B,C). In case of whole transcriptome assembly, SPAdes identified more viruses (2, 3, and 6 additional viruses/viroids in cvs. BB, DK, and RG, respectively) and viral contigs as compared to Velvet in all cultivars. Notably, Velvet based assembly failed to identify HSVd from any whole transcriptome, GYSVd2 from DK, or GYSVd1 and GYSVd2 from RG, whereas SPAdes identified HSVd in BB and RG, GYSVd2 in DK, and GYSVd1 and GYSVd2 in RG from the corresponding whole transcriptomes. However, Velvet did detect HSVd in each of the combined sRNAomes, GYSVd2 from DK, and GYSVd1 and GYSVd2 from RG (Figures 2G,H, Supplementary Figures S5A,B, and Supplementary Tables S3, S5).

Copy Number Estimation for Identified Viruses and Viroids in Each mRNA and sRNA Library
The number of host unmapped mRNA and sRNA reads ranged from 0.41 to 1.80M and 0.18 to 0.70M, respectively, across libraries. Though the average number of host unmapped reads was higher in case of mRNA (0.98M) compared to sRNA (0.41M), the proportion of unmapped reads to total reads was higher in the latter (9.75%) than the former (4.27%). On average, 2.06 and 0.02% of host-unmapped reads from mRNA and sRNA libraries, respectively, mapped to viral/viroidal genomes (Supplementary Figures S6A,B and Table 1). In general, the proportion of virus/viroid associated reads was relatively higher in mRNA libraries constructed from FP than YL while no such trend was observed in case of sRNA libraries. Based on copy number estimates, HSVd (94-100%) predominated in cv. BB in both mRNA and sRNA libraries. In case of mRNA libraries of cvs. DK and RG, HSVd and GYSVd2 were predominant in FP and YL, respectively. In sRNA libraries of cv. DK and in all but one of the sRNA libraries of cv. RG, GYSVd2 was predominant irrespective of tissue type (Supplementary Figures S7A,B). Further, both the replicates in each tissue of a cultivar were highly similar not only in detecting the viromes but also in estimating their copy number.

Viral/Viroid Genome Reconstruction From de novo Assembled Contigs
By mapping the viral/viroid associated contigs from combined mRNAome and whole transcriptome of each cultivar against the NCBI designated reference genomes of identified viruses and viroids, we obtained complete or near complete (>99%) genomes of 15 viruses and viroids from three cultivars (Table 2). Some other viral contigs could not be assembled into full genomes using the reference genomes as scaffolds. For the assembly of GLRaV3 and GLRaV4 genomes from cv. RG, the longest Trinity assembled contig of each virus was first blasted against the nr (NCBI) database. The complete genome of the most highly similar isolate was then used as a reference during mapping in each case. Trinity assembly of library RGFPR2 directly yielded the whole genome of GVB. Similarly, we obtained GVL genome from one of the Trinity assembled longest contigs from combined mRNAome of cv. RG through BLAST against nr (NCBI) database. In total, we obtained 19 complete/near complete viral/viroid genomes from three cultivars (Table 2). Trinity yielded relatively longer contigs for most viruses and viroids as compared to SPAdes in all cultivars with mRNA reads (Supplementary Figures S8, S9). On the contrary, SPAdes yielded relatively longer viral/viroid contigs as compared to Velvet in most instances when whole transcriptomes were assembled (Supplementary Figures S10, S11). Though Velvet assembled more viral/viroid contigs from combined sRNAomes, CLC yielded longer contigs for most viruses and viroids as compared to Velvet (Supplementary Figures S12, S13). However, we could not reconstruct any viral genome using contigs assembled from combined sRNAomes. From the reconstructed complete/near complete viral genomes, we could identify all of the anticipated ORFs for all recovered viruses using NCBI ORF finder (Supplementary Table S6). Nearly complete genomes that could be assembled to >95% but in which intact ORFs could not be identified (Supplementary Table S7) were still deemed incomplete.

Pairwise Distance and Phylogenetic Analyses Using Reconstructed Viral/ Viroid Genomes
Each of the complete/near complete genomes obtained was subjected to pairwise distance (Supplementary Tables S8-S19) and phylogenetic analyses (Figures 4A-L) along with related complete genomes retrieved from NCBI, and the most closely related genomes are indicated here, including their country of origin.

SNP Detection and Recombination Analyses in Reconstructed Genomes
A large number of SNPs was detected for RG GRVFV (168) followed by RG GLRaV3 (117), which were equally distributed throughout the genome, while no SNP was detected in case of RG GGVA, RG GLVd, RG GYSVd2, DK AGVd, and BB GYSVd1. Other viruses that had a good number of SNPs included GLRaV3 from cv. DK (102) and GLRaV4 (100), GVL (64), and GVB (40) from cv. RG (Figure 5A).
The reconstructed viral genomes, after alignment, were subjected to detection of recombination events. Among the eight reconstructed viral genomes, recombination events supported by at least five algorithms were detected in only three genomes. In GLRaV3 genomes of DK and RG, a similar recombination event was detected in the 5′ region of the genome. An additional recombination event was detected in the 3′ region of RG GLRaV3. For RG GLRaV4, we found only one recombinant sequence at the 3′ region (Figure 5B and Supplementary Table S20).

DISCUSSION
In this study, viromes of three Indian grapevine cultivars were determined and some of their whole genomes were reconstructed from publicly available mRNAome and sRNAome datasets (Tirumalai et al., 2019). Since the materials used in the present study were obtained from the Indian Institute of Horticultural Research (in Bangalore, India), one of the leading grapevine breeding centers in the tropical region (Tirumalai et al., 2019), it is the most appropriate one for performing virome analysis as all the vegetative propagules derived from the breeding stock would be expected to be infected with the same viruses. Interestingly, cv. RG, an introduction from California, had the maximum viral load in our study compared to the native cvs. BB and DK.
Uneven distribution of viruses and viroids across tissues of a perennial plant like grapevine (Kominek et al., 2009) suggested that sampling different tissues will reveal a more accurate sanitary status of a plant. We also found that pooling samples from different tissues was more reliable than relying on an individual tissue for virome analysis. Earlier, Jo et al. (2015) also reported the superiority of tissues-combined assemblies over the individual ones. We further observed that the combined mRNAome and whole transcriptome identified nearly similar acellular pathogens and both these approaches were more sensitive than individual or combined sRNAomes. This might be because of the smaller size and number of reads generated from sRNA libraries. Contrary to the observation of Maliogka et al. (2018), the proportion of viral and viroidal reads in mRNA libraries was higher than sRNA libraries in our study. This might be due to the fact that viral sRNAs are produced only upon activation of the host's antiviral defense while mRNAomes can even detect viruses and viroids that are unrecognized by the host. Further, a similar number of viruses and viroids was identified by Trinity and SPAdes assemblers from mRNAomes and by CLC and Velvet assemblers from sRNAomes. However, SPAdes outperformed Velvet in case of whole transcriptomes. So, when more than one assembler was used, one or more viruses that escaped detection by one assembler could be detected by the other (Massart et al., 2019). Thus, use of multiple tissues and assemblers enabled better unraveling of the grapevine virome.
In the present study, we identified 19 grapevine viruses and viroids (including two variants of GLRaV4) and four mycoviruses associated with the grapevine fungal pathogens-Erysiphe necator and Alternaria spp. (Kakalikova et al., 2009; Feng et al., 2018). Included among these is GRSLaV, which was earlier reported as a novel virus from California in cv. RG. This indicates the possible introduction of GRSLaV from California along with the RG propagule. However, GRSLaV is now regarded as a strain of grapevine leafroll associated virus 2 (GLRaV2) (as GLRaV-2RG) (Alkowni et al., 2011). Nonetheless, this is the first study that could successfully detect GLRaV-2 or any of its variants in India. Though Kumar et al. (2013) did attempt to detect this virus in India, they did not succeed; rather, they detected GLRaV1 and GLRaV3. Further, GLRaV5 and GLRaV6 are presently regarded as strains of GLRaV4 (Rai et al., 2017). On this basis, nine grapevine viruses and viroids (GGVA, GLRaV2, GRVFV, GVA, GVE, GVF, GVL, TSV, and GLVd) were detected for the first time in grapevine cultivars grown in Indian soil. Interestingly, we could identify GVL, the reference for which is not yet available in the NCBI, using the GVL genome obtained in this study.
Of the 19 complete/near complete genomes (>99% completion but <100%) obtained in this study, seven viral genomes (including four genomes of >13 kb) and one viroidal genome were recovered for the first time from any Indian grapevine cultivar. None of the viral whole genomes could be recovered from combined sRNAome assembled contigs as reported by Baranwal et al. (2015) and Jo et al. (2016). However, this might be due to the use of a lower number of sRNA reads (approximately one-fifth) as compared to the mRNA reads in the current study. Identification of DNA viruses in mRNAome is rare and construction of their whole genome is still scarce (Jo et al., 2017), but we could not only identify GGVA in mRNA of cv. RG but could also reconstruct its genome in entirety with 2905 nucleotides. Initially, the RG GVB and GVL genomes could not be recognized as the former diverged significantly (23%) from the reference genome while there was no reference genome for the latter. However, inspection of Trinity assembled longer contigs of individual and combined mRNA libraries through BLAST analysis against nr (NCBI) database, coupled with ORF prediction, identified the whole genomes of these isolates. Complete genomes could not be reconstructed for the RG GLRaV3 and GLRaV4 isolates using reference-based mapping because of their divergence (2.7 and 9.6% divergence of the RG GLRaV3 and GLRaV4 genomes, respectively) from the corresponding reference genome, though their near complete genomes could be reconstructed using the closely related genomes. Thus, examination of larger contigs assembled by various de novo assemblers coupled with the use of an increased number of reference genomes of a virus during mapping could increase the chances of whole genome recovery. Identification of a large number of viral SNPs in this study supports the quasispecies nature of plant viruses (Jo et al., 2018). Hence, the complete/near complete genomes reconstructed in this study were the consensus of viral variants present in a given cultivar.
We followed Jo et al. (2015) for copy number estimation except that we considered reads from only the non sgRNA region to reflect the true abundance of viruses that use sgRNA strategy for translation. Phylogenetic and distance matrix analyses revealed the divergence of AGVd, HSVd and GYSVd1 isolates obtained from different cultivars while the GLRaV3 and GYSVd2 isolates from cvs. DK and RG were related. Recombination analyses revealed that the RG GLRaV3, DK GLRaV3, and RG GLRaV4 isolates were recombinants of global isolates. Owing to the vegetative propagation of grapevine and free trade of planting materials, viruses and viroids can easily spread globally. In addition, coinfection of a single plant with numerous isolates of same/different viruses offers scope for recombination among different isolates (Jo et al., 2016).
Plants do not always express symptoms associated with every virus/viroid present, hence symptomology and individual virus/viroid based detection assays are not sufficient to determine the full spectrum of viruses/viroids present in a plant. Rather, use of available or newer transcriptome datasets is a better choice for profiling of viromes that can serve as a reliable base for indexing of planting materials in plant quarantine stations and during certification.

DATA AVAILABILITY STATEMENT
The datasets analyzed in this study are available in the NCBI repository under the Bioprojects PRJNA421907 (sRNA-seq) and PRJNA421908 (mRNA-seq). The whole genomes of 19 viruses and viroids reconstructed in this study have been submitted to GenBank (MN662228 to MN662245 and MN661401).

AUTHOR CONTRIBUTIONS
VS, AS, SJ, and VB conceptualized and formulated the study and read and approved the final manuscript. VS performed the bioinformatics analyses. VS and AS drafted the manuscript. VB and SJ edited the manuscript.