Identification of Two New Isolates of Chilli veinal mottle virus From Different Regions in China: Molecular Diversity, Phylogenetic and Recombination Analysis

Chilli veinal mottle virus (ChiVMV) is an important plant pathogen with a wide host range, causing serious yield losses in pepper production all over the world. Recombination is a major evolutionary event for single-stranded RNA viruses, which helps isolates adapt to new environmental conditions and hosts. Recombination events have been identified in multiple potyviruses, but so far, there have been no reports of recombination events among the ChiVMV population. We here detected ChiVMV in pepper samples collected from Guangxi and Yunnan provinces for the first time and amplified the nearly full-length sequences. Phylogenetic and recombination analyses were performed using the new sequences and the 14 full-length and 23 capsid protein (CP) sequences available in GenBank. Isolates tend to cluster on a geographical basis, indicating that geographic-driven evolution may be an important determinant of ChiVMV genetic differences. A total of 10 recombination events were detected among the ChiVMV sequences using RDP4 with a strict algorithm, and both the Guangxi and Yunnan isolates were identified as recombinants. Recombination appears to be a significant factor affecting the diversity of ChiVMV isolates.


INTRODUCTION
Chilli veinal mottle virus (ChiVMV) is a member of the genus Potyvirus in the family Potyviridae. It is a very common virus in chilli pepper (Capsicum annuum L.) in east Asian countries, causing serious losses in pepper production (Ong et al., 1980; Moury et al., 2005; Wang et al., 2006; Tsai et al., 2008; Shah et al., 2009). In addition to Capsicum annuum, ChiVMV can also infect many other plants in the family Solanaceae, including Nicotiana tabacum, Solanum lycopersicum, Solanum melongena, and Datura stramonium (Tsai et al., 2008; Ding et al., 2011; Yang et al., 2013; Zhao et al., 2014; Kaur et al., 2015). Symptoms of ChiVMV infection include mosaic mottling, twisted or fallen leaves, vein banding, and reduced fruit size (Tsai et al., 2008; Gao et al., 2016). The genome of ChiVMV is a single-stranded positive-sense RNA of about 9.7 kb, excluding the poly(A) tail. It encodes a polyprotein, which is then cleaved by virus-encoded proteases into 10 mature functional proteins (Yang et al., 2013; Gao et al., 2016). Aphis gossypii has been reported to transmit the virus in a non-persistent manner in solanaceous crops (Shah et al., 2008).
The public sequence databases currently contain the complete genome sequences of 14 ChiVMV isolates, all of which are from Asia. Several studies have reported genetic differences within the species based on analysis of CP sequences (Moury et al., 2005; Yang et al., 2013; Gao et al., 2016; Ahmad and Ashfaq, 2018) and there are currently 98 such sequences available. Discovering and determining the sequences of more isolates worldwide is important for our understanding of the molecular diversity and evolution of the virus. For single-stranded RNA viruses, recombination is a major evolutionary event that helps isolates adapt to new environmental conditions and hosts (Simon-Loriere and Holmes, 2011). Recombination events have been identified in many potyviruses (Revers et al., 1996; Gagarinova et al., 2008; Seo et al., 2009) but, so far, there have been no reports of recombination events among the ChiVMV population. In this study, we determined the nearly full-length sequences of ChiVMV in pepper from Guangxi and Yunnan provinces, China, and used the data to analyze molecular diversity and recombination events among ChiVMV isolates.

Whole-Genome Sequencing of Two New ChiVMV Isolates
From May to July 2020, we collected pepper samples with suspected viral disease symptoms (including dead tops, mosaic, mottling, wrinkled leaves, and chlorosis) from pepper fields in Guangxi and Yunnan provinces in China. Total RNA was extracted from infected pepper fruits using the TRIzol method, and first-strand cDNA was synthesized using a reverse transcription kit (Toyobo) according to the manufacturer's instructions. The complete genome sequences of the two new isolates were amplified as five overlapping fragments using specific primers (Supplementary Table 1). KOD neo enzyme (Toyobo) was used for PCR amplification, and the amplified fragments were purified with an Omega gel extraction kit and cloned into the pEASY-Blunt Zero vector (TransGen). At least two clones for each fragment were picked and sent for sequencing. Sequences were assembled using Vector NTI version 10. The complete genome sequences of the two isolates have been deposited in GenBank under accession numbers MT782116 (Guangxi) and MT974520 (Yunnan). Sequence analysis and comparison of the two new isolates with the other reported sequences were done using MEGA X (Kumar et al., 2018). The complete genome sequences and CP sequences of other ChiVMV isolates were downloaded from the National Center for Biotechnology Information (NCBI) (Supplementary Tables 2, 3).

Construction of Phylogenetic Trees
The whole genome sequences of 16 ChiVMV isolates and 25 CP-coding region sequences were used for phylogenetic analysis in the MEGA X software package (Kumar et al., 2018). The best-fit nucleotide substitution models for the full-length sequences of the 16 isolates and for the 25 CP sequences were determined using the model selection function in MEGA X to be, respectively, GTR + G + I (General Time Reversible + Gamma Distributed With Invariant Sites) and T92 + G (Tamura 3-parameter + Gamma Distributed). The trees were then constructed by the maximum-likelihood (ML) method according to the corresponding model with 1,000 bootstrap replicates. The sequence of the OKP41 isolate of pepper vein mottle virus (PVMV), a closely related member of the genus Potyvirus, was used as an outgroup.

Recombination and Selection Pressure Analysis
The whole genome sequences of 16 ChiVMV isolates and 25 CP-coding region sequences were used for recombination analysis. Six methods in the RDP4 software, namely RDP, GENECONV, BOOTSCAN, MaxChi, Chimaera, and SISCAN, were used with the default parameters to find possible parental isolates and recombination breakpoints (Martin et al., 2015). Site-specific selection pressure in the CP coding sequences was determined by single likelihood ancestor counting (SLAC), fixed effects likelihood (FEL), and mixed effects model of evolution (MEME) with a p-value threshold of 0.1, and by fast unconstrained Bayesian approximation (FUBAR) with a posterior probability of 0.9, as implemented in the HyPhy package (Pond and Frost, 2005).
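The thresholds above can be illustrated with a small sketch of how per-site results might be turned into lists of negatively selected codons. This is not the HyPhy implementation: the record layouts, the function names, the use of "less than or equal to" for the p-value cutoff, and all numbers are hypothetical and chosen only for illustration. Here alpha and beta stand for the synonymous and non-synonymous rates at a site, and a site is treated as negatively selected when beta is lower than alpha and the test is significant.

```python
# Illustrative sketch only; record layouts and values are hypothetical,
# not actual HyPhy output.

def fel_negative_sites(sites, p_threshold=0.1):
    """Return codons with beta < alpha and p-value <= p_threshold.

    sites: iterable of (codon, alpha, beta, p_value) tuples.
    """
    return [codon for codon, alpha, beta, p in sites
            if p <= p_threshold and beta < alpha]


def fubar_negative_sites(sites, min_posterior=0.9):
    """Return codons whose posterior probability of alpha > beta
    is at least min_posterior.

    sites: iterable of (codon, posterior_alpha_gt_beta) tuples.
    """
    return [codon for codon, posterior in sites if posterior >= min_posterior]


# Hypothetical per-site results for four codons.
toy_fel = [
    (1, 1.2, 0.1, 0.02),   # beta < alpha, significant: negative selection
    (2, 0.5, 0.6, 0.40),   # not significant
    (3, 0.2, 1.5, 0.05),   # significant, but beta > alpha: not negative
    (4, 2.0, 0.0, 0.10),   # beta < alpha, exactly at the threshold
]
toy_fubar = [(1, 0.97), (2, 0.55), (3, 0.02), (4, 0.91)]

fel_hits = fel_negative_sites(toy_fel)
fubar_hits = fubar_negative_sites(toy_fubar)

# Codons supported by both methods.
consensus = sorted(set(fel_hits) & set(fubar_hits))
print(fel_hits, fubar_hits, consensus)
```

Because each method applies its own statistical model, the counts it reports can differ considerably, so listing the sites supported by more than one method is one conservative way to summarize the results.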

Sequencing and Molecular Diversity Analysis of ChiVMV Isolates From Guangxi and Yunnan
Using specific primers, we amplified overlapping fragments by RT-PCR and assembled the nearly complete genome sequences of the ChiVMV Guangxi and Yunnan isolates. The sequences were 9,722 (Guangxi) and 9,724 (Yunnan) nucleotides long excluding the 3′-end poly(A) tail, and both contained a predicted open reading frame of 9,270 nt, encoding a polyprotein of 3,089 amino acids. In comparisons of their full-length genome sequences, the new isolates had 79-92.6% (Guangxi) and 80.6-89.8% (Yunnan) nt identity to the previously reported isolates, while the corresponding values for comparison with the outgroup PVMV sequence were 67.4% and 68%, respectively (Table 1 and Supplementary Table 4). The divergence of the Guangxi isolate is about 0.08-0.25 and that of the Yunnan isolate is about 0.11-0.23 (Table 1 and Supplementary Table 4).
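As a rough illustration of the two quantities reported above, the following sketch computes percent nucleotide identity between two aligned sequences and checks the ORF arithmetic. It is not the MEGA X procedure: the function names and toy sequences are hypothetical, columns with a gap in either sequence are simply skipped (MEGA X may treat gaps differently depending on its settings), and the ORF length is assumed to include the stop codon, which is consistent with the reported figures (9,270 / 3 = 3,090 codons, one of which is the stop).

```python
# Illustrative sketch only; not the MEGA X implementation.

def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def polyprotein_length(orf_nt: int) -> int:
    """Amino acids encoded by an ORF whose length includes the stop codon."""
    if orf_nt % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_nt // 3 - 1


# Hypothetical toy alignment: 10 gap-free columns, 9 of them identical.
toy_a = "ATGGCT-ACGT"
toy_b = "ATGGCTTACGA"
print(percent_identity(toy_a, toy_b))  # 90.0

# ORF length reported in the text.
print(polyprotein_length(9270))  # 3089
```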

Phylogenetic Relationships of ChiVMV Isolates Worldwide
A phylogenetic tree using all available full-length ChiVMV sequences with PVMV as an outgroup divides the isolates into two major clades (Figure 1). The first major clade has a sub-clade containing two isolates each from Hunan and Korea, and one isolate each from Guangdong, Wenchang, Guangxi (our isolate), and India. The second sub-clade is formed by two Indian isolates and one Pakistani isolate. The second major clade contains three Sichuan isolates and one Yunnan tobacco isolate (Figure 1). The newly identified Yunnan pepper isolate was also included in the second clade. Thus, our Guangxi and Yunnan isolates were more similar to the isolates from adjacent regions of China (such as Wenchang and Guangdong for the Guangxi isolate, and Sichuan for the Yunnan isolate) than to those from the more distant provinces.
The capsid protein (CP) gene is very important for the infection cycle of potyviruses. Its primary function is to encode the virus coat protein (Revers and García, 2015), and this region has often been used as the basis for comparisons to establish the taxonomy of potyviruses (Ball, 2005; Tsai et al., 2008). To better understand the genetic variability of the ChiVMV population, we selected 25 CP coding region sequences from different geographical locations to construct a phylogenetic tree. The tree had two well-defined major clades (Figure 2). The first clade contains isolates from Indonesia, Thailand, India, Vietnam, South Korea, and China (Guangxi, Hainan, and Taiwan). Isolates from Pakistan and from Yunnan, Liaoning, and Sichuan in China constitute the second clade (Figure 2).

Recombination and Selection Pressure Analysis
To analyze possible recombination signals in the ChiVMV population, we used RDP4 software to predict recombination events among the full-length sequences. Recombination events identified by at least three methods and with a P value less than 1 × 10⁻⁶ were considered credible, and 10 recombination events were detected in total (Table 2). The Guangxi and Yunnan isolates were predicted to be recombinants. The recombination event of the Guangxi isolate occurred between nts 2,756 and 5,284. The Hunan isolate and the Wenchang (Hainan Province) isolate were predicted to be possible parents. Interestingly, Hunan and Hainan are both neighboring provinces to Guangxi. The recombination event of the Yunnan isolate occurred between nts 31 and 1,400, and a Pakistani isolate and the Yunnan tobacco isolate were predicted to be possible parents (Table 2). The analysis indicated that geography-related recombination was likely a key factor in the evolution of ChiVMV.
We also detected two recombination events in the Yunnan tobacco isolate, at nt positions 9,316-9,739 and 27-1,696, respectively. One recombination event was detected in each of the three Indian isolates, GU170808, GU170807, and NC_005778. Three recombination events were detected in the MN207122-PK isolate, at various sites in the region 27-4,480 (Table 2).
We further analyzed the 25 selected CP sequences from different geographical locations for recombination and found three recombination events (Supplementary Table 5). Two Chinese isolates and one Pakistani isolate were predicted to be recombinants, indicating that recombination contributed to the variation of the CP sequences (Supplementary Table 5).
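The acceptance criterion used for the recombination events can be sketched as a simple filter. This is not RDP4 itself: the `Event` class, the function names, and every P value below are hypothetical, and the reading that each of the supporting methods must individually report a P value below 1 × 10⁻⁶ is an assumption about how the criterion was applied. A value of `None` stands for a method that reported no significant signal (NS).

```python
# Illustrative sketch only; data and names are hypothetical, not RDP4 output.
from dataclasses import dataclass
from typing import Dict, List, Optional

P_THRESHOLD = 1e-6   # P value cutoff stated in the text
MIN_METHODS = 3      # minimum number of supporting methods stated in the text


@dataclass
class Event:
    recombinant: str
    p_values: Dict[str, Optional[float]]  # method name -> P value, None = NS


def supporting_methods(event: Event, p_threshold: float = P_THRESHOLD) -> List[str]:
    """Methods that report a P value below the threshold for this event."""
    return sorted(method for method, p in event.p_values.items()
                  if p is not None and p < p_threshold)


def is_credible(event: Event,
                min_methods: int = MIN_METHODS,
                p_threshold: float = P_THRESHOLD) -> bool:
    """Assumed criterion: at least min_methods methods pass the threshold."""
    return len(supporting_methods(event, p_threshold)) >= min_methods


# Two hypothetical events with made-up P values.
events = [
    Event("toy_isolate_1", {"RDP": 2e-10, "GENECONV": 5e-8, "BOOTSCAN": None,
                            "MaxChi": 3e-7, "Chimaera": 1e-3, "SISCAN": None}),
    Event("toy_isolate_2", {"RDP": 1e-7, "GENECONV": None, "BOOTSCAN": 1e-4,
                            "MaxChi": 1e-4, "Chimaera": None, "SISCAN": 2e-5}),
]

credible = [e.recombinant for e in events if is_credible(e)]
print(credible)  # ['toy_isolate_1']
```

Requiring agreement between several methods in addition to a strict P value cutoff reduces the chance of reporting a signal that only one detection approach supports.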
We also performed selection pressure analysis on the CP coding sequences (287 aa), and found that many of the codons are subject to negative selection. A total of 87, 136, and 160 negatively selected codons were detected in the CP region by SLAC, FEL, and FUBAR methods, respectively. The codons 83,
FIGURE 2 | Phylogenetic tree based on the nucleotide sequences of the CP gene of 25 ChiVMV isolates. The phylogenetic tree was produced using MEGA X by the maximum-likelihood (ML) method using the Tamura 3-parameter algorithm and 1,000 bootstrap replications. The number at each branch of the phylogenetic tree is the bootstrap percentage. The tree was divided into two clades and the new Guangxi and Yunnan isolates are marked by green and red circles, respectively.
The whole genome sequences of the two newly identified isolates and 14 ChiVMV full-length sequences extracted from public nucleic acid databases were used for recombination analysis. The six methods in the RDP4 software, namely RDP, GENECONV, BOOTSCAN, MaxChi, Chimaera, and SISCAN, were used to find possible parental isolates and recombination breakpoints with the default parameters. Recombination events identified by at least three methods and with a P value less than 1 × 10⁻⁶ are listed. For the geographical origin of the isolates, see Table 1. NS, Not Significant.

DISCUSSION
In this study, we detected ChiVMV in pepper disease samples collected from Guangxi and Yunnan provinces for the first time, and amplified the nearly full-length sequences of these isolates by overlapping PCR. Among the other complete ChiVMV sequences that have been made available, the Guangxi isolate was least similar (79% nt identity) to the Yunnan tobacco isolate, suggesting that the host imposes a selective effect on the evolution of the virus (Table 1). However, since the Yunnan tobacco isolate was predicted to be a minor parent of the Yunnan pepper isolate identified in our work (Table 2), the similarity between these two was slightly higher. The Yunnan pepper isolate has the lowest nt similarity with an isolate from India, with a value of 80.6% (Table 1). Published studies of the evolution and variation of ChiVMV isolates have usually been based on CP sequences (Moury et al., 2005; Tsai et al., 2008; Yang et al., 2013; Gao et al., 2016; Ahmad and Ashfaq, 2018). By constructing a phylogenetic tree from the full-length sequences, we found that the viral isolates from the same or similar regions tend to group together (Figure 1). The analysis suggests that the evolutionary adaptation of the virus is driven by geographic location, which is consistent with the conclusion of Gao et al. (2016). However, in addition to Chinese isolates, we found that there were isolates from South Korea and India that were in the same clade as the Guangxi isolate (Figure 1). This may be due to the frequent trade of vegetables and ornamentals between China and these countries. Recombination is considered to be a significant source of plant virus genetic diversity (Simon-Loriere and Holmes, 2011). Recombination events have been identified in several potyviruses, such as soybean mosaic virus and potato virus Y (Revers et al., 1996; Gagarinova et al., 2008; Seo et al., 2009).
Our analysis detected a total of 10 recombination events within the full-length ChiVMV sequences, and the newly identified Guangxi and Yunnan isolates were both recognized as recombinants (Table 2), indicating that recombination plays an important role in shaping the adaptability of the ChiVMV population. We also detected three recombination events among 25 selected CP sequences from different geographical locations, and the Guangxi isolate was predicted to be a minor parent of the Sichuan isolate (KF738253.1) (Supplementary Table 5).
The site-specific selection pressure analysis of the CP coding region using a variety of methods shows that codons at many sites are subject to negative selection, which is consistent with previous findings (Ahmad and Ashfaq, 2018), suggesting a strong negative or purifying selection in the ChiVMV population.
In conclusion, our study has determined the nearly complete genome sequences of two ChiVMV isolates from Guangxi and Yunnan provinces, China. Comparisons with other sequences have shown that the genetic differences among ChiVMV isolates were likely correlated with geographical location. Recombination occurs actively in the ChiVMV population and may be a force driving the adaptive evolution of the virus. A comparative analysis of the genome sequences of additional ChiVMV isolates would be helpful to give a clearer picture of genetic variability and evolution in this important virus.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.

AUTHOR CONTRIBUTIONS
SR and FY: conceptualization. JC and FY: funding acquisition. SR, XC, and SQ: investigation. SR, JP, HZ, YL, and GW: methodology. WJ and YZ: resources. FY: supervision. SR: writing - original draft. FY: writing - review and editing. All authors contributed to the article and approved the submitted version.

FUNDING
This work was financially supported by the Chinese Agriculture Research System (CARS-24-C-04) and K. C. Wong Magna Fund in Ningbo University.