Rescuing eGFP-Tagged Canine Distemper Virus for 40 Serial Passages Separately in Ribavirin- and Non-Treated Cells: Comparative Analysis of Viral Mutation Profiles

Because their RNA-dependent RNA polymerases (RdRps) lack a proofreading mechanism, RNA viruses generally have high mutation frequencies and evolve rapidly into viral quasispecies during serial passages in cells, especially in cells treated with mutagens such as ribavirin. Canine distemper virus (CDV) belongs to the genus Morbillivirus. Its L protein functions as an RdRp during viral replication. In this study, a recombinant enhanced green fluorescence protein-tagged CDV (rCDV-eGFP) was rescued from its cDNA clone, followed by viral identification and characterization at passage 7 (P7). This recombinant was independently subjected to 40 additional serial passages (P8 to P47) in ribavirin- and non-treated cells. The two viral progenies, passaged in ribavirin- and non-treated VDS cells, were named rCDV-eGFP-R and -N, respectively. Both progenies were subjected to next-generation sequencing (NGS) at P47 to compare their quasispecies diversities. The rCDV-eGFP-R and -N showed 62 and 23 single-nucleotide mutations (SNMs) in their respective antigenomes, suggesting that ribavirin exerted a mutagenic effect on the rCDV-eGFP-R. The 62 SNMs comprised 26 missense and 36 silent mutations, and the 23 SNMs comprised 17 missense and 6 silent mutations. Neither the rCDV-eGFP-R nor the -N exhibited a nonsense mutation in its antigenome. We speculate that the rCDV-eGFP-R may contain at least one P47 sub-progeny characterized by high-fidelity replication in cells. If such a sub-progeny can be purified from the mutant swarm, its L protein would help elucidate a molecular mechanism of CDV high-fidelity replication.


INTRODUCTION
Canine distemper is a severe infectious disease affecting a broad variety of domestic and wild carnivores (McCarthy et al., 2007). Its etiological agent is canine distemper virus (CDV), also known as canine morbillivirus, classified in the genus Morbillivirus of the family Paramyxoviridae. Canine distemper virions are enveloped, pleomorphic particles containing single-stranded RNA of negative polarity. The CDV genome is generally 15 690 nucleotides (nt) long and follows the "rule of six", which is required for efficient replication between the genome and antigenome (Kolakofsky et al., 2005).
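The rule of six amounts to a simple divisibility check on genome length. The following minimal sketch illustrates it; the function name is hypothetical and introduced only for this example.

```python
def follows_rule_of_six(genome_length_nt: int) -> bool:
    """Return True when the genome length is an exact multiple of six."""
    if genome_length_nt <= 0:
        raise ValueError("genome length must be positive")
    return genome_length_nt % 6 == 0


# 15 690 nt is the typical CDV genome length quoted above (15690 = 6 * 2615).
print(follows_rule_of_six(15690))
```

The same check also holds for the 16536-nt recombinant genome reported for the rCDV-eGFP in this study (16536 = 6 × 2756).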
A CDV genome contains six transcriptional units, independently encoding six structural proteins (N, P, M, F, H and L proteins). Additionally, CDV codes for two nonstructural proteins (V and C proteins). The V protein is expressed via an RNA editing strategy from a P gene transcription unit (Mahapatra et al., 2003). All genes are arranged in an order of 3'-N-P/V/C-M-F-H-L-5' in the CDV genome. Six open reading frames (ORFs) are separated by untranslated regions (UTRs) with variable lengths. The L protein is an RNA-dependent RNA polymerase (RdRp), which is the largest of the virus proteins and is also the least abundant. It is assumed to carry all activities necessary for genomic RNA transcription and replication, as well as to be able to cap, methylate and polyadenylate viral mRNAs (Barrett et al., 2006). The morbilliviral L protein can only perform its function as the RdRp, when it associates with its co-factor, the P protein.
RNA viruses generally possess relatively high mutation frequencies in nature, mainly attributed to the lack of proofreading mechanisms in their RdRps (Mandary et al., 2019). Morbilliviruses are no exception. Due to the low-fidelity characteristics of morbilliviral RdRp, random mutations unavoidably occur during replication between the genome and antigenome. Single-nucleotide mutations (SNMs) are theoretically restricted to a small number of sites during the first few passages of a morbillivirus in cells, whereas continued proliferation of progenies results in more complex mutant spectra, namely viral quasispecies. Viral quasispecies can be defined as the mutant distributions that are generated upon replication of RNA viruses in vitro and in vivo (Andino and Domingo, 2015). Owing to the absence of proofreading mechanisms in their RdRps, RNA viruses can be regarded as ideal models for experimentally addressing key questions about the dynamics of replication and evolution (Arbiza et al., 2010; Sardanyes and Elena, 2011).
It is not easy to determine quasispecies diversity in viral progenies accurately. For example, Sanger sequencing, as a conventional method, cannot generate a dataset large enough to identify all SNMs in viral mutants. The advent of next-generation sequencing (NGS) offers much potential for analyzing thousands of viral sequences from a given host, noticeably improving our ability to quantify within-host sequence diversity in viral infections (Nelson and Hughes, 2015). As an alternative technique, NGS has been widely applied to estimate the exceptionally high diversity within viral quasispecies. With it, single-nucleotide polymorphisms (SNPs) in various viruses can be systematically analyzed to reveal their evolutionary dynamics (Campo et al., 2014; Hasing et al., 2016; Ni et al., 2016; Yang et al., 2018).
The purine nucleoside analog ribavirin has broad-spectrum activity against viruses (Graci and Cameron, 2006), including CDV (Elia et al., 2008; Carvalho et al., 2014; Lanave et al., 2017). Ribavirin can exert antiviral activity by increasing the error rate of viral genome replication. Consequently, a gradual accumulation of lethal mutations leads to a dramatic reduction in the replication ability of viral progenies through error catastrophe. Previous studies demonstrated that RNA viruses cultured in ribavirin-treated cells could evolve to generate ribavirin-resistant variants characterized by increased fidelity of their RdRps (Pfeiffer and Kirkegaard, 2003; Zeng et al., 2013; Tian and Meng, 2016; Griesemer et al., 2017). Therefore, ribavirin is an ideal mutagen for screening high-fidelity variants to reveal key molecular mechanisms in viral RdRps. Reverse genetics is broadly used to rescue recombinant CDVs that express foreign proteins, e.g., enhanced green fluorescence protein (eGFP). Recombinant eGFP-tagged CDV has been demonstrated to be a useful tool for tracing virus infection (Ludlow et al., 2012). We have previously constructed two reverse genetics platforms for different CDV strains. Both platforms facilitate recovery of marker-tagged (Liu et al., 2020) or antigen-expressing recombinant CDV. In this study, a recombinant eGFP-tagged CDV (rCDV-eGFP) was rescued, identified, characterized and then used as a model virus for forty additional passages separately in ribavirin- and non-treated cells. NGS was used to uncover the genomic mutation profiles of the ribavirin-screened and non-screened progenies.

Test Strip Detection of rCDV-eGFP
Culture supernatant of rCDV-eGFP-infected cells was harvested at P7, subjected to a single freeze-and-thaw cycle, and tested with a CDV infection test strip (Mensall®, Suqian, China) according to the manufacturer's instructions. Non-infected cell culture was also analyzed as a control.

Growth Kinetics of rCDV-eGFP
Growth kinetics of the P7 rCDV-eGFP was compared with that of the wt-CDV in vitro, as described previously (Liu et al., 2020). Briefly, VDS cells were plated into five 12-well plates (1.5 × 10^6 cells/well, 6 wells/plate) and incubated at 37°C for 2 h. The rCDV-eGFP and wt-CDV were separately inoculated (MOI = 0.001) into all plates (3 wells/sample) and incubated at 37°C for 2 h, after which the supernatants were replaced with DMEM for further incubation at 37°C. At 0, 24, 48, 72 and 96 h post infection (hpi), one plate was randomly removed from the incubator and subjected to a single freeze-and-thaw cycle to collect supernatant for viral titration by TCID50 assay. The viral titer of each sample was calculated by the Spearman-Kärber equation (Finney, 1952).
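For readers unfamiliar with the Spearman-Kärber calculation, the sketch below shows one common form of it. The function name, the plate layout and the well counts are hypothetical, since the text does not report the dilution series or the exact variant of the equation that was applied. The sketch assumes a dilution series that starts with all wells infected and ends with none infected.

```python
def spearman_karber_log10_tcid50(first_dilution_log10, step_log10,
                                 positive_wells, wells_per_dilution):
    """Return log10(TCID50) per inoculum volume by the Spearman-Karber method.

    first_dilution_log10: log10 of the reciprocal of the first dilution (1 for 10^-1)
    step_log10: log10 of the dilution factor (1 for tenfold steps)
    positive_wells: infected wells at each successive dilution
    wells_per_dilution: wells inoculated at each dilution
    """
    proportions = [p / wells_per_dilution for p in positive_wells]
    if proportions[0] != 1.0 or proportions[-1] != 0.0:
        raise ValueError("series must run from all-positive to all-negative wells")
    # Endpoint = first dilution + step * (sum of positive proportions - 0.5)
    return first_dilution_log10 + step_log10 * (sum(proportions) - 0.5)


# Hypothetical plate: tenfold dilutions 10^-1 to 10^-8, 4 wells per dilution.
positive = [4, 4, 4, 4, 3, 1, 0, 0]
log10_titer = spearman_karber_log10_tcid50(1, 1, positive, 4)
print(f"10^{log10_titer:.2f} TCID50 per inoculum volume")
```

In this toy series, the 50% endpoint falls midway between the 10^-5 dilution (3 of 4 wells infected) and the 10^-6 dilution (1 of 4 wells infected).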

Extra Forty Serial Passages of rCDV-eGFP
The P7 rCDV-eGFP was independently passaged (3 d/passage) in ribavirin- and non-treated VDS cells for a further 40 serial passages (P8 to P47). The 50% cytotoxic concentration (CC50) of ribavirin had been measured at 6.1 mM for the VDS cell line, and its 50% effective concentration (EC50) had been determined to be 432 µM against another reporter-tagged CDV (5804P strain) (Liu et al., 2020). Therefore, to let the rCDV-eGFP adapt gradually to the ribavirin-treated cells, the ribavirin concentration in DMEM was increased stepwise with passaging: 240 µM from P8 to P13, 320 µM from P14 to P22, 360 µM from P23 to P24, 400 µM from P25 to P29, and 440 µM from P30 to P47. The two rCDV-eGFP progenies, passaged in ribavirin- and non-treated VDS cells, were named rCDV-eGFP-R and -N, respectively.
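The stepwise schedule above can be written as a small lookup table. The sketch below is only a bookkeeping aid with hypothetical names; the passage ranges and concentrations are taken from the text.

```python
# (first passage, last passage, ribavirin in µM), as listed in the text.
RIBAVIRIN_SCHEDULE = [
    (8, 13, 240),
    (14, 22, 320),
    (23, 24, 360),
    (25, 29, 400),
    (30, 47, 440),
]
EC50_UM = 432   # against the reporter-tagged 5804P strain (Liu et al., 2020)
CC50_UM = 6100  # 6.1 mM for the VDS cell line


def ribavirin_um(passage: int) -> int:
    """Return the ribavirin concentration (µM) used at a given passage."""
    for first, last, concentration in RIBAVIRIN_SCHEDULE:
        if first <= passage <= last:
            return concentration
    raise ValueError(f"passage {passage} is outside P8 to P47")


# Number of passages spent at each concentration step.
passages_per_step = {c: last - first + 1 for first, last, c in RIBAVIRIN_SCHEDULE}
print(passages_per_step)
print(sum(passages_per_step.values()))
```

The steps sum to the 40 passages stated in the text. The final concentration (440 µM) is close to the reported EC50 and well below the CC50.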

NGS of rCDV-eGFP-R and -N at P47
Culture supernatants of rCDV-eGFP-R- and -N-infected cells were separately harvested at P47 for extraction of total RNA using the Viral RNA/DNA Extraction Kit (Takara, Dalian, China). The RNA samples were reverse transcribed with random hexamers using the HiScript® 1st Strand cDNA Synthesis Kit (Vazyme, Nanjing, China), according to the manufacturer's instructions. Library construction and Illumina sequencing were performed as described previously. In brief, the NEBNext® Ultra™ II RNA Library Prep Kit (NEB, Ipswich, MA, USA) was used for library construction. After adapter ligation, ten cycles of PCR amplification were performed to enrich the sequencing targets. The libraries were pooled at an equal molar ratio, denatured and diluted to the optimal concentration prior to sequencing. The Illumina NovaSeq 6000 (Illumina, San Diego, CA, USA) was used to generate paired-end 150 bp reads.

Competent rCDV-eGFP Is Recovered From Its cDNA Clone
The rCDV-eGFP cDNA clone was co-transfected with three helper plasmids for rescuing the rCDV-eGFP. A small number of plasmid-transfected cells had begun to emit green fluorescence (Figure 1B) at 72 h post transfection (hpt). The cell monolayer was digested with trypsin for further co-cultivation with VDS cells. The rescued rCDV-eGFP was subjected to serial blind passages in VDS cells. Fluorescent syncytium formation was always observable during passaging (Figure 1B).

The rCDV-eGFP Is Identified by RT-PCR and Test Strip
Total RNA was extracted from rCDV-eGFP-infected cell culture at P7 for RT-PCR analysis to confirm the viral identity. A band of the expected amplicon size (1001 bp) was observed only in the RT-PCR lane (Figure 1C). As a control, PCR detection revealed no residual cDNA clone plasmid that could affect the RT-PCR analysis (Figure 1C). Sanger sequencing showed that the P7-based RT-PCR product was identical to the expected 1001-bp sequence. Additionally, rCDV-eGFP-infected and non-infected culture supernatants were tested with test strips, and only the former gave a positive result (Figure 1D).

The rCDV-eGFP Has Similar Growth Kinetics to That of the wt-CDV
Growth kinetics of rCDV-eGFP at P7 was compared with that of wt-CDV during the 96-h period of viral culture. Syncytia induced by both viruses were observable at 24 hpi and became more extensive over time. Both viruses exhibited similar growth kinetics up to 72 hpi (Figure 1E), but differed significantly at 96 hpi.

NGS Shows Analyzable Sequencing Depths
The rCDV-eGFP had a 16536-nt-long recombinant genome. Figure 2A schematically shows all ORFs and UTRs in proportion to their actual distributions in the viral antigenome.
To uncover the mutation profiles of rCDV-eGFP-R and -N at P47, viral samples were subjected to NGS analysis. The complete NGS data were analyzed using bioinformatic tools, yielding acceptable sequencing depths. The average depths were 195× and 93× for the rCDV-eGFP-R (Figure 2B) and -N (Figure 2C) antigenomes, respectively. Both samples had approximately 99.9% coverage of the full-length antigenome sequence. Uncovered regions were located only at the 5′- and 3′-end regions of the antigenome.
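As an illustration of how average depth and coverage relate to per-position read depths, the sketch below summarizes a list of depths. The function name and the toy depth values are hypothetical, and the sketch assumes that the average is taken over all antigenome positions, including uncovered ones; the text does not state which convention was used.

```python
def depth_summary(depths):
    """Return (mean depth, fraction of positions with at least one read)."""
    if not depths:
        raise ValueError("depth list is empty")
    covered = sum(1 for d in depths if d > 0)
    return sum(depths) / len(depths), covered / len(depths)


# Toy example: five positions, both ends uncovered.
toy_depths = [0, 100, 200, 300, 0]
mean_depth, coverage = depth_summary(toy_depths)
print(mean_depth, coverage)

# Rough size of the uncovered ends implied by ~99.9% coverage of 16536 nt.
GENOME_NT = 16536
uncovered_nt = round(GENOME_NT * (1 - 0.999))
print(uncovered_nt)
```

On this arithmetic, about 99.9% coverage of a 16536-nt antigenome leaves roughly 17 nt uncovered in total, which is consistent with gaps confined to the two termini.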

Twenty-Six Single-Amino Acid Mutations Are Identified in rCDV-eGFP-R
Twenty-eight SNMs led to 26 SAAMs in 6 proteins of the rCDV-eGFP-R at P47 (Figure 4C). Out of these 26 SAAMs, 24 directly resulted from their individual SNMs, and the other two (K2156R and K2157G) were attributed to the complex mutations (A16342G, A16343G, A16344G and A16345G) in the L ORF.
Mutation frequencies of SAAMs are given in brackets in Figure 4C. Five amino acid sites had a mutation frequency of 100%. Of the 7 proteins, only the N protein had no SAAM, and the L protein showed the most SAAMs, all at low mutation frequencies (6.04 to 17.52%). The eGFP, as a foreign protein, displayed only one SAAM (N147D).
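The way four consecutive A-to-G substitutions can produce exactly the two SAAMs K2156R and K2157G can be checked with a short codon exercise. The sketch below makes several assumptions that the text does not state: all four substitutions sit on the same genome copy, both wild-type codons are the lysine codon AAA (the third base of the second codon could equally be G), and the four-base run lies within the two codons concerned. The names are hypothetical.

```python
# Codon table restricted to codons made of A and G, which is all this example needs.
AG_CODONS = {
    "AAA": "K", "AAG": "K", "AGA": "R", "AGG": "R",
    "GAA": "E", "GAG": "E", "GGA": "G", "GGG": "G",
}


def translate(seq):
    """Translate an A/G-only sequence whose length is a multiple of three."""
    return "".join(AG_CODONS[seq[i:i + 3]] for i in range(0, len(seq), 3))


def mutate_run(seq, start, length=4):
    """Replace `length` consecutive bases with G, starting at 0-based offset `start`."""
    return seq[:start] + "G" * length + seq[start + length:]


wild_type = "AAAAAA"  # assumed: two adjacent lysine codons (K2156, K2157)
for start in range(3):
    mutant = mutate_run(wild_type, start)
    print(start, mutant, translate(wild_type), "->", translate(mutant))

# Offsets at which the four-base run yields arginine followed by glycine.
consistent_offsets = [s for s in range(3)
                      if translate(mutate_run(wild_type, s)) == "RG"]
print(consistent_offsets)
```

Under these assumptions, only one codon phase reproduces K-to-R followed by K-to-G: the run must start at the second base of the first codon, giving AGG (arginine) and GGA (glycine). That would place nt 16342 at the second position of codon 2156.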

DISCUSSION
A rescue system for CDV was reported as early as 2000 (Gassen et al., 2000). CDV is an effective vector for expressing foreign proteins (Parks et al., 2002; Plattet et al., 2004; Ludlow et al., 2012; Wang et al., 2012; Liu et al., 2014; Chen et al., 2019). We previously established a reverse genetics platform for the CDV 5804P strain (Liu et al., 2020). In the present study, we used this platform to rescue the rCDV-eGFP, intending to use the eGFP as a fluorescent reporter for independently unveiling viral evolutionary patterns under mutagen- and non-treated conditions. The rCDV-eGFP had a growth curve similar to that of the wt-CDV during the 72-hpi period (Figure 1E), suggesting no significant interference of eGFP with viral growth. Viral self-proteins are intrinsically expressed in CDV-infected cells. Harmful or even lethal SNMs inevitably arise with viral passaging, but would not accumulate in viral self-sequences, thereby avoiding the impact of error catastrophe on virus propagation. In contrast, SNMs should be unrestricted in the eGFP ORF during viral replication, because the eGFP is a foreign protein that is hardly involved in CDV-related events such as regulation and packaging. SNMs would be random, uncontrolled and retainable in the eGFP ORF during virus growth, which initially prompted us to use the rCDV-eGFP to uncover its mutation profile and the viral quasispecies diversity after dozens of serial passages. To promote the occurrence of SNMs, rCDV-eGFP-infected cells were treated with ribavirin, which can act as a viral mutagen that forces RNA viruses into mutagenesis and even error catastrophe (Cameron and Castro, 2001; Crotty et al., 2001).
The EC50 of ribavirin against CDV was 432 µM, as determined in our previous report (Liu et al., 2020). To let the rCDV-eGFP adapt gradually to selective pressure, the ribavirin concentration was progressively increased (240, 320, 360, 400 and 440 µM) with passaging in virus-infected cells. The P7 rCDV-eGFP was subjected to 40 serial passages (P8 to P47) in ribavirin-treated cells. The P47 progeny was expected to form a rich diversity of viral quasispecies, since most RNA viruses are genetically unstable when they replicate in hosts (Andino and Domingo, 2015). NGS analysis was used to compare quasispecies diversities between the rCDV-eGFP-R and -N at P47. Sanger sequencing was not used here because it cannot quantify the complexity of mutant spectra. Instead, NGS, which can generate a large dataset for identifying SNMs in viral genomes (Huang et al., 2019; Lu et al., 2020), was used in this study.
As a non-self sequence in the recombinant, the eGFP ORF would theoretically contain many more SNMs per 100 nt than viral self-sequences do at P47. Nonetheless, the NGS analysis revealed that the eGFP ORF harbored only two SNMs (A3881G and G3943A) in the rCDV-eGFP-R and one (C4051G) in the rCDV-eGFP-N, suggesting that, despite the mutagenic pressure exerted by ribavirin during viral passaging, the eGFP ORF did not accumulate many SNMs in the rCDV-eGFP-R antigenome. The G3943A was a silent mutation, whereas the A3881G was a missense mutation causing an SAAM (N147D) with a mutation frequency of 7.45% in the eGFP of rCDV-eGFP-R. It remains to be elucidated whether this SAAM was responsible, to some extent, for the one or two syncytia with attenuated or absent fluorescence that occasionally appeared during passaging (data not shown). Despite one missense mutation (C4051G) found in the eGFP ORF of rCDV-eGFP-N at P47, no non-fluorescent syncytia were observed during viral passages in non-treated cells (data not shown).
The other sequences were simultaneously analyzed to reveal their mutation profiles in the rCDV-eGFP-R and -N at P47. In the rCDV-eGFP-R, the N, P, M, F and H ORFs had mutation frequencies similar to that of the eGFP ORF. The L ORF had an approximately 2-fold higher mutation frequency than the other ORFs in the rCDV-eGFP-R. Interestingly, out of 36 SNMs in the L ORF, 22 clustered in a short region (nt 15883 to 16417) close to the 3' end of the L gene, implying that this region is poorly conserved (Figure 4A). In comparison with the rCDV-eGFP-R, the rCDV-eGFP-N showed a low mutation frequency across the full-length antigenome (Figure 4B). The rCDV-eGFP-N showed only nine SNMs in its L ORF, and not a single SNM was identified in its UTRs. The morbillivirus V protein is produced from the P gene through a frame shift, caused by the incorporation of one G residue during transcription at a particular mRNA editing site (5'-UUAAAAAGGGCACAG-3') that is conserved among morbilliviruses (Mahapatra et al., 2003; Muhammad et al., 2013). Indeed, by means of NGS, we identified this site (data not shown), into which one extra G residue was inserted to form an edited fragment (5'-UUAAAAAGGGGCACAG-3').
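The editing event described above can be reproduced on the two sequences quoted in the text. The sketch below inserts one G at the start of the G run and confirms that the result matches the edited fragment and shifts the downstream reading frame by one nucleotide. The function name is hypothetical, and inserting at the start of the run is an arbitrary choice, because the position of an extra G within a run of identical bases cannot be distinguished.

```python
UNEDITED_SITE = "UUAAAAAGGGCACAG"   # editing site quoted in the text
EDITED_SITE = "UUAAAAAGGGGCACAG"    # edited fragment quoted in the text


def insert_g(site: str) -> str:
    """Insert one G at the start of the first G run, mimicking a +1 G edit."""
    i = site.index("G")
    return site[:i] + "G" + site[i:]


edited = insert_g(UNEDITED_SITE)
print(edited == EDITED_SITE)
# A one-nucleotide insertion is not a multiple of three, so the frame shifts.
print((len(edited) - len(UNEDITED_SITE)) % 3)
```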
The rCDV-eGFP-R and -N had 62 and 23 SNMs, respectively, supporting our earlier speculation that ribavirin is able to induce error-prone replication of the rCDV-eGFP in vitro. Interestingly, the comparative analysis showed no SNM shared by both progenies at P47, suggesting that, irrespective of ribavirin-exerted pressure, mutation events occurred randomly during viral passaging. Transitions are the type of mutation associated with ribavirin mutagenesis. Agudo et al. (2010) demonstrated that viral adaptation to ribavirin led to extinction escape via modulation of transition types. Indeed, in the present study the rCDV-eGFP-R had a much higher transition/transversion ratio (60/2, Figure 3B) than the rCDV-eGFP-N did (14/9, Figure 3D).
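The distinction between transitions and transversions can be made explicit with a small classifier. The sketch below uses hypothetical function names and takes the three eGFP-ORF SNMs named in this Discussion as examples; the counts 60/2 and 14/9 are those reported above.

```python
import re

PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CTU")


def classify_snm(label: str) -> str:
    """Classify an SNM written like 'A3881G' as a transition or transversion."""
    match = re.fullmatch(r"([ACGTU])(\d+)([ACGTU])", label)
    if match is None:
        raise ValueError(f"unrecognized SNM label: {label}")
    ref, alt = match.group(1), match.group(3)
    if ref == alt:
        raise ValueError("reference and mutant bases are identical")
    pair = {ref, alt}
    # Purine<->purine or pyrimidine<->pyrimidine changes are transitions.
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


for snm in ("A3881G", "G3943A", "C4051G"):
    print(snm, classify_snm(snm))

# Transition/transversion ratios from the reported counts.
ratio_r = 60 / 2
ratio_n = 14 / 9
print(ratio_r, round(ratio_n, 2))
```

The ratio is roughly 30 for the rCDV-eGFP-R and about 1.6 for the rCDV-eGFP-N, which is the contrast the paragraph above describes.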
The RdRp-caused error rate is normally a primary driving force of the mutation frequencies observed in RNA virus populations (Borderia et al., 2016). Morbilliviruses are characterized by high mutation frequencies in their genomes during serial passages in vitro, mainly attributed to the lack of effective proofreading activity in their L proteins (see our previous review). Measles virus has shown a spontaneous mutation rate of 1.8 × 10^-6/nt/replication under nonselective conditions (Zhang et al., 2013). An earlier report estimated a mutation rate of 9 × 10^-5/nt/replication for measles virus, corresponding to a genomic mutation rate of 1.43/replication (Schrag et al., 1999).
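The per-nucleotide and per-genome rates quoted above are linked by genome length. The sketch below assumes a measles virus genome of 15 894 nt, a value that is not given in the text, and uses a hypothetical function name.

```python
MEASLES_GENOME_NT = 15894  # assumption: genome length is not stated in the text


def mutations_per_genome(rate_per_nt: float, genome_length_nt: int) -> float:
    """Expected mutations per genome per replication for a uniform per-site rate."""
    return rate_per_nt * genome_length_nt


# Schrag et al. (1999): 9e-5 per nt per replication.
print(round(mutations_per_genome(9e-5, MEASLES_GENOME_NT), 2))
# Zhang et al. (2013): 1.8e-6 per nt per replication.
print(round(mutations_per_genome(1.8e-6, MEASLES_GENOME_NT), 3))
```

With this genome length, the higher estimate gives about 1.43 mutations per genome per replication, matching the genomic rate quoted above, whereas the lower estimate corresponds to roughly 0.03.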
Our previous report revealed that the N ORF of wild-type small ruminant morbillivirus (SRMV) underwent the most mutation events among the six structural genes during 90 serial passages in non-treated VDS cells. Unfortunately, the data in that report were based on Sanger sequencing. We subsequently rescued an eGFP-tagged recombinant SRMV (rSRMV-eGFP) (Liu et al., 2019), followed by 45 serial passages in ribavirin-treated VDS cells. More recently, its mutation profiles with serial passaging were uncovered via NGS analysis, revealing that a total of 34 SNMs, including 5 silent, 21 missense and 1 nonsense mutations, arose with passaging. The L ORF was found to harbor only 8 SNMs, all of which were missense mutations. The eGFP had one nonsense mutation, causing non-fluorescent syncytia to become gradually visible with passaging. Compared with the rSRMV-eGFP that underwent 45 passages with ribavirin screening, the rCDV-eGFP-R in the present study showed a high mutation frequency in its L ORF, but no nonsense mutation in its eGFP ORF.
Historically, a virulence-attenuated CDV (Rockborn strain) was demonstrated to revert to a virulent status after serial passages in dogs (Appel, 1978). Dogs vaccinated with a polyvalent vaccine containing the Rockborn strain exhibited suspected encephalitis. The Rockborn strain was withdrawn from several markets after the mid-1990s (Martella et al., 2011). Considering the potential risk of reversion to virulence, it is necessary to screen for high-fidelity CDV strains to explore their mechanisms of high-fidelity replication. In the present study, the mutagen-resistant progeny was shown to have rich quasispecies diversity at P47. According to the classical theory of viral quasispecies (Lauring and Andino, 2010; Domingo et al., 2012; Andino and Domingo, 2015; Borderia et al., 2016), we speculate that the P47 progeny of rCDV-eGFP-R might be composed of high-, moderate- and low-fidelity mutants. If the high-fidelity ones can be purified from such a mutant swarm, key SAAMs in the L protein would help clarify a molecular mechanism of viral high-fidelity replication.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: NCBI (accession: PRJNA752812).

AUTHOR CONTRIBUTIONS
FL conducted experiments and wrote the manuscript. NW, JL, QW and YH performed the experimental work. YZ and HS provided the funding. HS supervised the project. All authors contributed to the article and approved the submitted version.