Reassortment Between Divergent Strains of Camp Ripley Virus (Hantaviridae) in the Northern Short-Tailed Shrew (Blarina brevicauda)

Genomic reassortment of segmented RNA virus strains is an important evolutionary mechanism that can generate novel viruses with profound effects on human and animal health, such as the H1N1 influenza pandemic in 2009 arising from reassortment of two swine influenza viruses. Reassortment is not restricted to influenza virus and has been shown to occur in members of the order Bunyavirales. The majority of reassortment events occurs between closely related lineages purportedly due to molecular constraints during viral packaging. In the original report of Camp Ripley virus (RPLV), a newfound hantavirus in the northern short-tailed shrew (Blarina brevicauda), phylogenetic incongruence between different genomic segments suggested reassortment. We have expanded sampling to include RPLV sequences amplified from archival tissues of 36 northern short-tailed shrews collected in 12 states (Arkansas, Iowa, Kansas, Maryland, Massachusetts, Michigan, Minnesota, New Hampshire, Ohio, Pennsylvania, Virginia, Wisconsin), and one southern short-tailed shrew (Blarina carolinensis) from Florida, within the United States. Using Bayesian phylogenetic analysis and Graph-incompatibility-based Reassortment Finder, we identified multiple instances of reassortment that spanned the Hantaviridae phylogenetic tree, including three highly divergent, co-circulating lineages of the M segment that have reassorted with a conserved L segment in multiple populations of B. brevicauda. In addition to identifying the first known mobatvirus-like M-segment sequences from a soricid host and only the second from a eulipotyphlan mammal, our results suggest that reassortment may be common between divergent virus strains and provide strong justification for expanded spatial, temporal, and taxonomic analyses of segmented viruses.


INTRODUCTION
Genomic reassortment is an important evolutionary mechanism by which phylogenetically distinct, segmented viral strains that co-infect a host cell may shuffle gene segments to generate new viral genotypes (Vijaykrishna et al., 2015;Lowen, 2018). Reassortment is possible for any segmented virus and occurs during the packaging stage of viral replication where individual segments from two or more strains are combined into a single virion. While a majority of reassortment events are thought to result in deleterious combinations and subsequent reduction in fitness, reassortment on occasion can lead to novel combinations that can result in increased fitness (McDonald et al., 2016). However, the extent of reassortment and associated clinical implications, such as increased pathogenicity, are not well-defined. In influenza A virus, reassortant strains have been repeatedly documented to increase virulence compared to the respective parental strains (White and Lowen, 2018). For example, the H1N1 strain responsible for the 2009 influenza A epidemic was a reassortant composed of several divergent strains of influenza virus, including at least two from swine, and one each from birds and humans (Trifonov et al., 2009). Reassortment is not restricted to influenza virus. It has been suggested that a large number, possibly most, currently recognized members of the order Bunyavirales are reassortants (Briese et al., 2013). This includes evidence of reassortment in disease-causing viruses, such as Crimean-Congo hemorrhagic fever virus (Hewson et al., 2004), Rift Valley fever virus (Liu et al., 2017), Ngari virus (Lowen, 2018), and many hantaviruses (Klempa, 2018).
Possible implications to human health posed by viral reassortment may be profound, given that global movement of people and changing species distributions due to environmental disruption may increase novel interactions of divergent viruses (Carlson et al., 2017(Carlson et al., , 2020. Understanding the extent of viral reassortment among wild reservoir host populations, as opposed to in vitro experiments, and the effectiveness of segment molecular incompatibilities in constraining successful reassortment, is dependent on spatial and temporal screening of diverse wild hosts at the community and population levels, as well as concerted efforts to produce genomic level sequences for comparative analyses. Viruses of the family Hantaviridae contain three genomic segments named for their relative size, small (S), medium (M), and large (L), which encode the nucleocapsid protein, glycoprotein precursor, and RNA-dependent RNA polymerase, respectively (Plyusnin et al., 1996;Abudurexiti et al., 2019). Klempa (2018) provides a summary of the role reassortment has played in the evolution of hantaviruses across multiple temporal scales. For example, Bruges virus in the European mole (Talpa europaea) is likely the result of one or more ancient reassortment events (Laenen et al., 2018). In contrast, other studies suggest recent reassortment events in Sin Nombre virus harbored by the deer mouse (Peromyscus maniculatus) (Henderson et al., 1995;Li et al., 1995;Black et al., 2009) and Puumala virus in the bank vole (Myodes glareolus) (Razzauti et al., 2008(Razzauti et al., , 2009). Briese et al. (2013) suggested that reassortment may be quite common and that all members of the order Bunyavirales, of which Hantaviridae is a member, could be reassortants or descendants of historical reassortment events.
Camp Ripley virus (RPLV) is a hantavirus originally discovered in the northern short-tailed shrew (Blarina brevicauda) (Order Eulipotyphla, Family Soricidae) . Bennett et al. (2014) suggested that the S segment of RPLV could be the result of recombination due to its ambiguous placement in the hantavirus phylogenetic tree; however, that assessment was based on sequence coverage from a single sample that limited insight into the evolutionary history of RPLV. With expanded sampling of RPLV from multiple populations and new statistical approaches, we have been able to develop a clearer picture of host switching and reassortment within RPLV. However, much more extensive sampling across populations and geography of B. brevicauda is still necessary to achieve a truly complete history.

Ethics Statement
All field procedures for trapping of shrews and well-established protocols for processing and preserving their tissues were reviewed and approved by the Institutional Animal Care and Use Committee of the University of New Mexico, under protocol number 19-200908-MC.

Sampling and Sequencing
Total RNA was extracted from liver and/or lung tissue, using the PureLink Micro-to-Midi total RNA purification kit, from 101 B. brevicauda and 10 southern short-tailed shrews (B. carolinensis) archived at the Museum of Southwestern Biology at the University of New Mexico (Table 1). Shrews were collected between 1980 and 2001 and represent sampling across a large portion of the distributions of both species in the United States (Figure 1). Complementary DNA (cDNA) was synthesized using the SuperScript III First-Strand Synthesis System (Invitrogen, San Diego, CA) with a universal oligonucleotide primer (5 ′ -TAGTAGTAGACTCC−3 ′ ) designed from the conserved 3 ′ -end of the S, M, and L segments of hantaviruses .
Gene amplification was carried out in 20-µL reaction mixtures containing 250 µM dNTP, 2 mM MgCl 2 , 1 U of AmpliTaq polymerase (Roche, Basel, Switzerland), and 0.25 µM of oligonucleotide primers, designed from highly conserved regions of previously identified soricid-borne hantaviruses. A listing of the oligonucleotide primers used to amplify the S, M, and L segments is provided in Supplementary Table 1. Initial denaturation was followed by touchdown PCR cycling (twodegree step-down annealing from 48 to 38 • C for 40 s) and elongation at 72 • C for 1 min, then 32 cycles of denaturation at 94 • C for 40 s, annealing at 42 • C for 40 s, and elongation at 72 • C for 1 min, in a GeneAmp PCR 9700 thermal cycler (Perkin-Elmer, Waltham, MA, USA). Amplified products were separated by electrophoresis on 1.5% agarose gels and purified using the QIAQuick Gel Extraction Kit (Qiagen, Hilden, Germany). DNA was sequenced directly using an ABI Prism 377XL Genetic Analyzer (Applied Biosystems Inc., Foster City, CA, USA). Numbers in the parentheses represent the length of sequence recovered for that segment.

Phylogenetic Analysis
Additional sequences representative of all currently recognized species of Hantaviridae were downloaded from GenBank (Supplementary Table 2). Nucleotide alignments were generated for each segment using MAFFT v7.402 (Multiple Alignment using Fast Fourier Transform) (Katoh and Standley, 2013) with the Smith-Waterman algorithm (-localpair and -maxiterate 1000). Alignments were visually inspected in Geneious v. 8 (https://www.geneious.com). The 3 ′ -and 5 ′ -ends that were poorly aligned and had limited coverage across samples were trimmed to ensure analysis of homologous regions. This process was repeated for the amino acid translation of each segment to address possible artifacts introduced by high levels of nucleotide variation and potential homoplasy between deeply divergent sequences. Bayesian phylogenetic inference was performed for nucleotide and amino acid alignments in MrBayes 3.2.6 (Ronquist et al., 2012) with a mixed model of evolution and a gamma distribution with the command "lset nst = mixed rates = gamma" and run for 10,000,000 generations with trees sampled every 1,000 generations. Convergence was verified by inspecting that standard deviation of split frequencies between runs was under 0.01, as recommended, with trees sampled every 1,000 generations. Convergence was tested using Tracer v.1.7.1 (Rambaut et al., 2018) to verify the effective sample size for the posterior was above 200. A burn-in of 25%, as recommended (Ronquist et al., 2012), was used prior to generating a 50% majority rule consensus tree. Means of raw p-distances between groups and within groups were calculated on the amino acid alignment in MEGA7 (Kumar et al., 2016).

Tests of Reassortment
Reassortment was tested for the M and L segments using the Graph-incompatibility-based Reassortment Finder program (GiRaF version 1.01) (Nagarajan and Kingsford, 2011) with the tree files produced by MrBayes as input. The tree files for each segment were used for graph-mining to determine gene segments with incompatible phylogenies. A reduced dataset was used for analysis in GiRaF to accommodate the requirement of identical taxa between trees. A 25% burn-in, 50% cull, all candidate set, non-star bicliques, and single bicliques options were used for the GiRaF analysis. While GiRaF was originally written for reassortment analysis of influenza viruses, it is broadly applicable to any segmented virus capable of reassortment. A tanglegram of proposed reassorted sequences was visualized in R with the package "phytools" (Revell, 2012). As a secondary test of reassortment the program RDP4 was used (Martin et al., 2015). Samples with sequences for both the L and M segment were concatenated with the S segment, when available. Reassortment breakpoints were tested by using the RDP, GENECONV, Bootscan, Maxchi, Chimera, SiSscan, and 3Seq methods. The highest possible p-value for accepting a reassortment event was set to 0.05 and all other parameters set to default.

Phylogenetic Analysis and Tests of Reassortment
Virus sequences were recovered for 36 B. brevicauda and one B. carolinensis (Figure 1 and Supplementary Table 3).
Phylogenies generated independently for all three segments showed differing levels of incongruence. Trees inferred from the M segment recovered three distinct, deeply paraphyletic clades across the Hantaviridae (Figures 1, 2). The largest clade recovered was predictably part of the Orthohantavirus clade that includes other eulipotyphlan-borne hantaviruses (harbored by shrews and moles) from both the Old World and New World and represents the majority of our samples (labeled Ripley1). The Ripley1 clade was further subdivided into two wellsupported subclades that roughly encompassed the geographic distribution of B. brevicauda, with an eastern subclade of shrews from Ohio, Massachusetts, and Pennsylvania, and a western subclade from Minnesota, Iowa, Wisconsin, and Kansas. This East-West division was also reflected in previously generated host phylogenies inferred from cytochrome b (Brant and Orti, 2003 (Kang et al., 2009;Laenen et al., 2018). Mean within-group amino acid p-distances for each clade in the M segment ranged from 2% to 4% divergence. Mean betweengroup distances were much larger, however, with 70% amino acid sequence identity between Ripley1 and Ripley2. Ripley3 shared 47% amino acid sequence similarity with Ripley1 and 49% with Ripley2. The L-segment phylogenies did not match the pattern of diversification seen in the M segment, with all L-segment samples forming a single, well-supported monophyletic clade within Orthohantavirus (Figure 3). That clade was composed of virus segments representative of each clade seen in the M segment, however, the L-segment phylogeny placed MSB49712, the single M segment Ripley3 representative, as basal to the rest of RPLV. RPLV formed a sister relationship with Oxbow virus from the American shrew mole (Neurotrichus gibbsii) and Tigray virus from the Ethiopian white-footed mouse (Stenocephalemys albipes), and in turn were part of the well-supported clade of the genus Orthohantavirus. Within the larger clade of RPLV variation was rather low with a mean p-distance about 1.5% divergence. However, when compared to MSB49712, the mean distance was 11%.
Although our sampling for the S segment was less complete than for the M and L segments, we recovered two distinct RPLV clades (Figure 3), with the major S-segment clade containing representatives of both Ripley1 and Ripley2 from the M-segment phylogeny. The one representative from Ripley3 in the S-segment phylogeny, MSB49712 from Arkansas, was related to the Hainan oriental leaf-toed gecko virus (HOLGV), a non-mammalian borne hantavirid-like virus from China, sharing only 43% amino acid sequence similarity with the rest of RPLV. Interestingly, the placement of HOLGV in both the M-and L-segment phylogenies disagreed with the findings in the original report of HOLGV (Shi et al., 2018).
GiRaF analyses identified 11 candidate sets of reassortment with high confidence (>0.95). Of these, nine contained only sequences from RPLV (Supplementary Table 4). Additionally, all RPLV included in the reduced dataset for GiRaF were consistently identified as reassortants. However, our GiRaF settings were set to the highest sensitivity to detect reassortment across the phylogeny, which also resulted in detection of reassortment among subclades within Ripley1 due to incongruence of internal branching structure toward the tips (Supplementary Figure 1). RDP4 identified six samples as likely reassortants with significant p-values depending on the method used (Supplementary Table 5). The six identified samples represented all three RPLV clades in the M-segment phylogeny. The breakpoints identified as the sites of reassortment corresponded to the concatenation points between the L, M, and S segments (Supplementary Table 6).

DISCUSSION
Reassortment among divergent viral strains can catalyze rapid evolution and result in serious consequences for human health. Benign parental strains, when reassorted, can introduce novel interactions with host systems and increase the ability to evade the host immune system or the likelihood of a virion to enter a host cell, leading to increased virulence (Vijaykrishna et al., 2015;McDonald et al., 2016). However, reassortment has been generally thought to be restricted to closely related viral strains due to molecular incompatibilities that inhibit reassortment among highly divergent viruses in vivo (McDonald et al., 2016;Klempa, 2018;White and Lowen, 2018). In this study, we show a possible example of reassortment between deeply diverged viruses differing by 30-50% in the amino acid sequence of the M segment. While our understanding of how common reassortment between distantly related hantaviruses is limited, this study suggests that such events might not be so restrained. However, our ability to fully explore this for RPLV is limited by the lack of sequence representation for all segments, as well as by only partial coverage across the RPLV genome.
The hantavirus species demarcation criteria, published in the Ninth Report of the International Committee on Taxonomy of Viruses (King et al., 2012), which required at least a 7% amino acid sequence difference in the complete nucleocapsid and glycoprotein complex, have been abandoned. And DivErsity pArtitioning by hieRarchical Clustering (DEmARC) analysis of the coding regions of the complete S-and M-segment sequences has been used to objectively establish taxonomic classification of the family Hantaviridae since 2017 (Laenen et al., 2019). The limitations of our data set (that is, lacking the full-length S-and M-segment sequences of hantavirids from B. brevicauda and B. carolinensis) do not allow definitive conclusions about whether or not the three RPLV clades represent three different hantavirus species.
Co-circulation of distinct hantaviruses has been reported previously in other eulipotyphlan species, including Seewis and Altai-like viruses in the Eurasian common shrew (Sorex araneus) in Finland (Ling et al., 2014), Bruges and Nova viruses in the European mole in western Europe (Laenen et al., 2016), and most recently, Seewis, Altai, and Altai-like viruses co-circulating in sympatric populations of Eurasian common shrew, Laxmann's shrew (S. caecutiens), and flat-skulled shrew (S. roboratus) in Hungary and Russia (Kang et al., 2019). Here, we report three distinct M-segment lineages of RPLV co-circulating in B. brevicauda in the United States. However, co-circulation of distinct viruses within a single species does not necessarily indicate either current or historical reassortment. For example, despite Bruges and Nova virus co-infections of the same host individual on multiple occasions (Laenen et al., 2018), evidence of reassortment between the two viruses has yet to be documented. That is not the case for Seewis and Altai-like viruses cocirculating in shrews where phylogenetic disagreement between segments suggests an ancient history of host-switching and reassortment although this has yet to be tested statistically (Ling et al., 2014;Kang et al., 2019).
We recovered a similar pattern of disagreement between phylogenies for RPLV. Incongruencies in tree topology were found across all three segments of the RPLV genome. Further, the L-segment monophyly within RPLV was consistent with a model of classical co-diversification with its shrew host, whereas the M segment was composed of at least three unique histories suggesting multiple instances of successful host switches followed by subsequent reassortment of the L and S segments. Our sampling across genes limits our ability to more robustly test the history of reassortment in this system. Two samples that do have sequences for the L, M, and S segments, MSB89866 and MSB89863, were identified by RDP4 as containing the M segment from the Ripley2 clade and the L and S segments from representatives of the Ripley1 clade. This would follow the pattern of reassortment within Hantaviridae of the L and S segments packaged together with a divergent M segment, although this is not always the case (Klempa, 2018). Expanded sampling across the entire RPLV genome and more individual hosts will be necessary to adequately address this question.
Our analysis of the L segment was limited to a relatively short section (347 bp), potentially affecting the reliability of phylogenetic inference. Still, while deeper nodes lack adequate posterior support, RPLV was well supported as distinct and monophyletic in the L-segment phylogeny (Figure 3). This pattern is difficult to explain biogeographically considering the depth of divergence and geographic distance between the mammalian hosts of the viruses that are sister to each RPLV clade; for example, Bruges virus and Ripley2 (Figure 3). It is possible that the Ripley2 and Ripley3 clades represent ancient lineages that predate the migration of the ancestor of Blarina to North America and have since been maintained. We find this scenario unlikely given the deep history of host colonization of North America (Brant and Orti, 2003), coupled with the high mutation rate of hantaviruses that should have a monophyletic lineage of hantaviruses in Blarina, which is not what we see. Another possibility is that the placement of the Ripley2 and Ripley3 clades in the M-segment phylogeny is strictly stochastic or an artifact of long-branch attraction among deeply divergent viruses. However, this scenario does not address how such deeply divergent strains are maintained within a single species. Lastly, the Ripley2 and Ripley3 clades could represent a relatively recent host switch to B. brevicauda. Implicit in this explanation is that because geography precludes direct contact and transmission between host species there is likely a significant amount of hantavirus diversity that has yet to be discovered, which would help us better interpret these biogeographical gaps. It is likely that B. brevicauda is the primary host of RPLV (Ripley1 clade) and that the Ripley2 and Ripley3 clades are the result of recent hostswitching events and subsequent reassortment from unknown hantaviruses yet to be discovered. It is important to note, however, that the topologies recovered here, specifically for the L and S segments, could be prone to bias and artifacts during phylogenetic inference due to the relatively short L-segment sequences and the limited S-segment representation.
Overall, our data suggest that reassortment among highly divergent viral strains may be rather common. Evidence of reassortment between deeply divergent viruses highlights the urgency of more extensive sampling of the global virome (French and Holmes, 2020) and hantaviruses, in particular, across diverse mammalian and non-mammalian taxa, including recently discovered hantavirid-like viruses recovered from reptiles, fish, and mosquitoes (Li et al., 2015;Shi et al., 2018). The diversity and evolutionary dynamics of hantaviruses remain limited, as does knowledge of the complete suite of organisms that host them. Continued research focused on increasing breadth of taxa examined, as well as population level studies within individual host species, is required if we are to develop a more complete history of the origin and diversification of the Hantaviridae family.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.

ETHICS STATEMENT
The animal study was reviewed and approved by the Institutional Animal Care and Use Committee of the University of New Mexico, under protocol number 19-200908-MC.

AUTHOR CONTRIBUTIONS
SL, JC, and RY wrote and edited the manuscript. SL performed phylogenetic and statistical analyses. HK, SA, and SG generated the sequence data and assisted in editing the manuscript.