Identification of a New Infectious Pancreatic Necrosis Virus (IPNV) Variant in Atlantic Salmon (Salmo salar L.) that can Cause High Mortality Even in Genetically Resistant Fish

Infectious pancreatic necrosis (IPN) is an important viral disease of salmonids that can affect fish during various life cycles. In Atlantic salmon, selecting for genetically resistant fish against IPN has been one of the most highly praised success stories in the history of fish breeding. During the late 2000s, the findings that resistance against this disease has a significant genetic component, which is mainly controlled by variations in a single gene, have helped to reduce the IPN outbreaks to a great extent. In this paper, we present the identification of a new variant of the IPN virus from a field outbreak in Western Norway that had caused mortality, even in genetically resistant salmon. We recovered and assembled the full-length genome of this virus, following the deep-sequencing of the head-kidney transcriptome. The comparative sequence analysis revealed that for the critical amino acid motifs, previously found to be associated with the degree of virulence, the newly identified variant is similar to the virus’s avirulent form. However, we detected a set of deduced amino acid residues, particularly in the hypervariable domain of the VP2, that collectively are unique to this variant compared to all other reference sequences assessed in this study. We suggest that these mutations have likely equipped the virus with the capacity to escape the host defence mechanism more efficiently, even in the genetically deemed IPN resistant fish.


INTRODUCTION
Infectious pancreatic necrosis (IPN) is one of the leading viral diseases of the farmed Atlantic salmon (Salmo salar L.). The disease was first reported in 1941, following an outbreak in brook trout (Salvelinus fontinalis L.) in Canada (M'Gonigle, 1941). Further, the characterisation of the causative agent, the IPN virus (IPNV), was performed in 1960 (Wolf et al., 1960). In Norway, the virus was first isolated in 1975 from freshwater rainbow trout (Oncorhynchus mykiss L.) (Hastein and Krogsrud, 1976) and was designated as a notifiable disease from 1991 till 2008 (Sommerset et al., 2020). The virus can infect Atlantic salmon during all of its developmental stages, but the fish are especially susceptible as fry, during start-feeding, and post-smolts, soon after transfer to seawater (Roberts and Pearson, 2005;Sommerset et al., 2020). As suggested by the name, the pancreas is the primary target tissue of IPNV. However, the liver has also been shown to be one of the key organs affected by this virus (Ellis et al., 2010;Munang'Andu et al., 2013). In addition to significant economic losses, this disease is a concern for animal welfare, as survivors following infection can continue to infect naïve fish (Roberts and Pearson, 2005).
IPNV is a double-stranded RNA virus that belongs to the Birnaviridae family. A characteristic feature of the birnaviruses is their possession of a bi-segmented genome (i.e., A and B segments), contained within a non-enveloped, icosahedral capsid (Dobos and Roberts, 1983;Duncan and Dobos, 1986;Duncan et al., 1987). The smaller genomic segment, the B segment, is about 2.5 kb in size and consists of a single open reading frame (ORF), encoding an RNA-dependent, RNA polymerase VP1 (approximately 90 kDa) (Duncan and Dobos, 1986;Duncan et al., 1987). The A-segment is approximately 3 kb, and it contains two partially overlapping ORFs (Duncan and Dobos, 1986). The first ORF encodes VP5, a small, cysteine-rich, non-structural protein of about 17 kDa. A precursor polyprotein (NH2-preVP2-NS VP4 protease-VP3-COOH) of approximately 106 kDa constitutes the second and the larger ORF. The encoded VP4 protease cleavages the polyprotein into its two main components, the preVP2 and VP3. The preVP2 further matures to become VP2 and forms the outer capsid protein, containing the neutralizing epitopes and sites that facilitate cell attachment (Heppell et al., 1995b;Dopazo, 2020). The VP3 is the inner capsid protein (Duncan and Dobos, 1986;Duncan et al., 1987) but also, in association with the VP1, it seems to be involved in viral packaging and replication (Tacken et al., 2002).
Despite extensive vaccination efforts, disinfection and targeted breeding programs, IPN outbreaks remains a concern in farmed Atlantic salmon. Related mortalities can be substantial and, in part, can be a function of the genetic makeup of the host (Guy et al., 2006;Houston et al., 2008;Moen et al., 2009), management and environmental factors (Jarp et al., 1995;Sundh et al., 2009), and the strain and specific genetic variations carried by the virus Skjesol et al., 2011). There are different examples of isolates belonging to the same serotype to cause different degrees of mortality in the host or losing their virulence after serial passages in cell culture (Dorson et al., 1978;Bruslind and Reno, 2000;Santi et al., 2004;Shivappa et al., 2004;Song et al., 2005). Indeed, many studies have investigated the molecular basis of IPNV virulence in Atlantic salmon (e.g., Santi et al., 2004Santi et al., , 2005Song et al., 2005;Skjesol et al., 2011;Mutoloki et al., 2016). Although the genetic details of such processes are not yet fully understood, most studies speculated, and chiefly investigated, the genetic variations in segment A concerning the degree of virulence in IPNV. In particular, specific amino acid motifs in the VP2 of the polyprotein have been suggested to be most prevalent and in association with the virulent forms of the pathogen (Bruslind and Reno, 2000;Munang'andu et al., 2016;Mutoloki et al., 2016;Santi et al., 2004;Shivappa et al., 2004;Song et al., 2005). Variations in the amino acid residues in positions 217, 221 and 247 of the VP2 have been of particular focus (Dopazo, 2020;Julin et al., 2013;Munang'andu et al., 2016;Mutoloki et al., 2016;Santi et al., 2004;Song et al., 2005). Further, assessment of the recombinant viral strains has indicated that a threonine (T) residue rather than proline (P) in position 217 of the VP2 might be the most crucial amino acid that can result in the loss or gain of virulence (e.g., Song et al., 2005). Nonetheless, there has also been evidence from field outbreaks and challenge experiments that IPNV variants with a P substitute in this position can lead to high mortalities (Bain et al., 2008;Ruane et al., 2009Ruane et al., , 2015Tapia et al., 2015;Maj-Paluch et al., 2020).
Despite concerns and economic losses due to infection by IPNV, increased resistance against this virus through selective breeding is among the most remarkable success stories in the history of aquaculture and salmon breeding in particular. The initial attempt to estimate the genetic parameters controlling this trait, which was based on mortality in the field and controlled challenge testing, had shown that resistance to IPNV has a significant and heritable genetic component (h 2 of 0.17-0.62) (Wetten et al., 2007;Kjøglum et al., 2008). However, the breakthrough came with the finding that resistance to IPNV is mainly controlled by variations in a single quantitative trait locus (QTL) on chromosome 26 (Houston et al., 2008, Houston et al., 2010Moen et al., 2009). Subsequent work identified mutations in the epithelial cadherin gene (cdh1) as the likely causative genetic variation for IPNV resistance in Atlantic salmon (Moen et al., 2015). Following the inclusion of this QTL into the breeding programs in Norway, the industry witnessed a sharp decline in the number of IPN outbreaks throughout the country, dropping from 223 to only 19 reported cases from 2009 to (Sommerset et al., 2020. This paper reports the identification and whole-genome sequence assembly and analysis of a new IPNV variant. The isolates were recovered from a field outbreak in the West of Norway following reported mortalities, due to an IPN infection, Frontiers in Genetics | www.frontiersin.org November 2021 | Volume 12 | Article 635185 among QTL homozygous or heterozygous IPN resistant fish. We describe the amino acid motifs that distinguish these isolates from other phylogenetically related variants and suggest that these mutations are likely to play an essential role in this variant pathogenicity.

Field Outbreak and Sample Collection
In May of 2019, there was an IPN field outbreak in a Western region of Norway. The affected fish were Atlantic salmon of the SalmoBreed strain, delivered as unvaccinated, IPN QTL, all favorable homozygous or heterozygous for the QTL locus (Houston et al., 2008;Moen et al., 2009). Fish were at the post-smolt stage with an approximate average weight of about 700-800 gr at the time of the outbreak. The reported mortality reached about 10%, a high percentage for a field outbreak among QTL carrying fish. The IPNV infection was identified following an inspection by the veterinary authorities. The diagnosis was later validated by detection of the VP2 segment of the viral genome using polymerase chain reaction (qRT-PCR) of the infected head-kidney and through histopathological and immunohistochemical examinations of the liver and pancreatic tissues ( Figures 1A,B), all performed at the Fish Vet Group Norway (http://fishvetgroup.no) (Supplementary Table S1). Tissues were fixed in a 4% formalin solution (4% formalin, 0.08 M sodium phosphate, pH 7.0), processed using Thermo Scientific Excelsior ® and embedded in Histowax with the aid of Tissue-Tek ® , TEC 5 (Sakura) embedding system. The embedded tissues were then sectioned at 1.5-2 µm using a Leica RM 2255 Microtome and stained with Hematoxylin-Eosin (HE). The stained sections were scanned in an Aperio ScanScope AT Turbo slide scanner and read using Aperio Image Scope (Leica Biosystems). The scans were examined and scored in a blind fashion, i.e., without information about the animal's history. A clip from the adipose fin was also sampled and stored in 95% ethanol for DNA extraction and genotyping of ten fish, five moribund and five dead. We genotyped these fish for the genetic markers used to assign individuals as QTL carriers (Houston et al., 2008;Moen et al., 2009Moen et al., , 2015 and confirmed that all animals carry IPN QTL.

Transcriptome Sequencing and Viral Genome Assembly
The head-kidney was collected from six infected fish (http:// fishvetgroup.no/pcr-2/#1452163253406-d81f6dce-bdf9), all at a moribund state, and stored in RNALater ® (Ambion). Total RNA extraction was performed using the RNeasy ® Plus mini kit (Qiagen). Nanodrop ND-1000 Spectrophotometer (NanoDrop Technologies) was used to assess the concentration and purity of the extracted genetic materials. The Norwegian Sequencing Centre performed library construction and sequencing of the total RNA on an Illumina HiSeq 4000 platform as paired-end (PE) 150 bp reads. We first cleaned the sequence data by removing low-quality nucleotides and sequencing adapters using Trimmomatic (v.0.36) (Bolger et al., 2014). The ribosomal RNA (rRNA) was then identified through a similarity search against the SILVA rRNA database (Quast et al., 2013) and excluded from subsequent analysis. The host-specific transcriptome was detected by aligning the remaining reads against the Atlantic salmon reference genome assembly (ICSASG_v2) (Lien et al., 2016) using TopHat (v.2.0.13) (Trapnell et al., 2009, Trapnell et al., 2012. The un-aligned reads were then blasted against the IPNV reference assemblies (ASM397166v1, ASM397170v1, ASM397174v1, ASM397168v1, ASM397172V1 and ViralMultiSegProj15024) by setting "task" as "megablast," "word size" to 7 and "expectation value threshold" to 1.00 e −07 . All the reads that aligned to the IPNV reference sequences were extracted, pooled into a single sequence datafile per animal, and then fed into Trinity (v.2.11.0) (Grabherr et al., 2011) with the default parameter settings to construct assemblies. We refer to the assembled viral genomes as BG_x_y, where x denotes the animal's id (1-6) and y refers to the viral genome's specific segment (e.g., VP1 or polyprotein).
The sequences included 90 isolates with the complete coding sequence for the polyprotein and 120 for the VP1. Both the nucleotide and amino acid alignments were performed using ClustalX (Larkin et al., 2007) implemented in Unipro UGENE (v36.0) (Okonechnikov et al., 2012) or Muscle (Edgar, 2004) implemented in Seqotron (v1.0.1) (Fourment and Holmes, 2016). Unipro UGENE and Mega X (v10.1.7) (Kumar et al., 2018) were also used to assess sequence similarity, calculate pair-wise distances for nucleotide and amino acid data, and construct phylogenetic associations using the neighbour-joining (NJ) method. To assess the confidence of node assignments, we performed 1,000 bootstrap replications.

Transcriptome Sequencing and Genome Assembly
This paper describes the genomic features of IPNV isolates responsible for a field outbreak in an Atlantic salmon population. The post-mortem examination identified IPN infection as the chief cause of mortality among all fish investigated. The histopathological and immunohistochemical examinations ( Figures 1A,B; Supplementary Table S1) and the viral genetic material detected through qRT-PCR amplification (Supplementary Table S1) further confirmed earlier diagnosis. We found that the genotypes of all animals examined in this study were consistent with an IPN resistant animal (Houston et al., 2008;Moen et al., 2009;Moen et al., 2015). Further, assessment of the transcriptome sequence data confirmed that all fish carried at least one favorable copy of the causal mutation in the cdh1 gene, conferring resistance against IPNV (Moen et al., 2015). The causal mutation, which is in the form of a single nucleotide polymorphic (SNP) variation, is located in the protein-coding sequence of cdh1 (ssa26: 15192533; C/T) and can change Proline (P) to Serine (S), with the former amino acid found to be associated with the IPNV resistance (Moen et al., 2015). It is expected that in a heterozygote fish, the favorable allele exhibit a close to complete dominant effect in response to the virus (Moen et al., 2015). Among the animals investigated in this study, three fish were homozygous for the favorable allele (C/C), and the other three were heterozygous a Double peak on the chromatograms after two cell culture passages. The dominating amino acid is indicated first.
Frontiers in Genetics | www.frontiersin.org November 2021 | Volume 12 | Article 635185 (C/T). The extracted RNA was sequenced to high depth, with an average of 60 million PE reads per animal. Following the rRNA's exclusion, more than 45 million PE reads per fish remained for the assemblage of the host's viral genome and profiling gene expression. Together, the two segments of the IPNV genome consist of about 5,800 nucleotides. On average, we obtained 56fold coverage of the viral genome from each animal in our sequence data.

Comparative Sequence Analysis of the Polyprotein
Comparative genomic analysis between the six assembled genomes revealed at least two different, closely linked variants, caused the field outbreak in the population investigated ( Figures  2A,B; Supplementary Table S2A, S2B). The nucleotide sequences of the polyprotein are identical in four of the assemblies (BG_1_polyprot to BG_4_polyprot) and contain nine polymorphic sites compared to the other two assembled sequences (i.e., BG_5_polyprot and BG_6_polyprot). Two of the polymorphisms are nonsynonymous, changing the amino acid residues in positions 47 (aspartate (D) to glutamate (E)) and 717 (lysine (K) to glutamine (Q)).
In a study aiming to trace the IPNV infection signatures from hatcheries to the sea (Kristoffersen et al., 2018), the authors have recovered a partial fragment of the polyprotein (2,772 bp) from an isolate (accession number: MH562009), with very similar nucleotide (∼99.6%) and amino acid (∼99.8-100%) sequences to the ones reported here (Supplementary Figures S1A, S1B). Recently, in controlled infection challenge testing, the mortality induced by this isolate among full-sib families was compared with a standard, virulent form of the virus (AY379740.1). While the infection with the latter variant resulted in approximately 30% mortality, mainly among the non-QTL fish, the former isolate caused more than 75% mortality, treating the QTL and non-QTL fish alike (Ørpetveit et al. in preparation; https://www.vetinst.no/ nyheter/endringer-i-ipn-viruset-gjor-fisken-mer-utsatt-forsykdom, https://www.fishfarmingexpert.com/article/changes-inipn-virus-make-salmon-more-susceptible, https://salmobreed. no/articles/benchmark-genetics-intensiverer-forskningen-panytt-ipn-virus). Comparison between the nucleotide sequences of this isolate with our assembled genomes shows divergence ranging from 0.003 to 0.004 (Supplementary Figure S1A). However, this isolate has deduced amino acid sequences identical to the translated sequences from our first four assemblies (Supplementary Figure S1B), indicating that this  Table 2. The associated genogroups are in parentheses. Except for the Sp 975/99, which was recovered from the Shetland Isles, Scotland (Smail et al., 2006), all the Sp serotypes are based on different field outbreaks found in the Northwest of Norway (Santi et al., 2004). BG 1 to 6 represent isolates identified in the current work. The confidence of association between sequences was estimated using bootstrap testing (1,000 replicates) and is shown next to the branches. The evolutionary distances were computed using the Poisson correction method (Zuckerkandl and Pauling, 1965) and are in the units of the number of amino acid substitutions per site. The rate variation among sites was modeled with a gamma distribution (shape parameter 1). All ambiguous positions were removed in a pairwise deletion fashion.
Frontiers in Genetics | www.frontiersin.org November 2021 | Volume 12 | Article 635185 variant is likely to be the ancestral form and the other isolates (i.e., BG_5_VP2 and BG_6_VP2) are more recently derived. One of the main characteristics of RNA viruses is their high mutation rates, mainly attributed to their lack of an effective proofreading activity. Therefore, finding different sequence variants from the same outbreak might not be surprising, but it indeed is important, as it will allow one to follow the trajectory of pathogen evolution through time and space. Comparative assessment of the polyprotein nucleotide (data not shown) as well as the amino acid sequence data with multiple reference strains (Figure 2A) shows our assemblies are most closely related to the Sp serotype (genogroup 5). However, they also constitute a distinct group within this serotype compared to Sp isolates assessed in this study (Figure 2A; Supplementary Figure S2A). Several investigations have previously reported variations in some amino acid residues, particularly in the VP2 of the polyprotein, to generally be associated with a higher degree of the IPNV virulence (Bruslind and Reno, 2000;Munang'andu et al., 2016;Mutoloki et al., 2016;Santi et al., 2004;Shivappa et al., 2004;Song et al., 2005). Interestingly, the amino acid motifs found in the six assembled genomes in this study are mainly consistent with the virus's avirulent form ( Table 1) (Santi et al., 2004;Shivappa et al., 2004;Song et al., 2005). For instance, for the most critical positions associated with the virulence, i.e., 217 and 221 of the VP2, the amino acid residues of our assembled genomes are proline (P) and threonine (T) ( Table 1; Supplementary File S1), while it is expected that in the virulent form, the corresponding residues to be T and A, respectively (Santi et al., 2004;Song et al., 2005;Mutoloki et al., 2016). Similar patterns have also been reported from the isolates recovered from outbreaks, with the amino acid signature comparable to the non-virulent forms in the key positions (Bain et al., 2008;Maj-Paluch et al., 2020;Ruane et al., 2009, Ruane et al., 2015Tapia et al., 2015). On the other hand, we identified ten amino acid residues in the assembled genomes' polyprotein that are either exclusively or with a disproportionate frequency in the reported isolates ( Table 2; Supplementary File S1). Nine of these sites are within the VP2, and one is in the VP4. In the VP2, these residues are all close to one another, encompassing the hypervariable region (Blake et al., 2001;Heppell et al., 1995b). Specifically, these residues are in positions 245,248,252,255,257,278,282,285,321 (Table 2;Supplementary File S1). This data provides additional support for the previously suggested notion that the amino acid motifs in the VP2 capsid protein hypervariable region might play an essential role in the degree of the virulence displayed between different IPNV isolates (Bruslind and Reno, 2000;Skjesol et al., 2011). It is also likely that this specific constellation of the amino acid residues reported in these isolates had helped the virus to escape the host defence barrier, despite fish being regarded as genetically resistant to IPNV. In the VP4, the deduced amino acid residue, unique to the assembled isolates ( Table 2; Supplementary File S1), is located in position 585. The amino acid substitution is a glycine (G) residue, compared to the other sequences investigated in this study, where they carry an aspartate (D) ( Table 2).
While this paper was under review, another article also reported similar variations from isolates recovered from outbreaks in Scotland (Benkaroun et al., 2021). Taken together, these results confirm the virulence of this variant and provide some indication of the geographical regions it has spread.

Comparative Sequence Analysis of VP5
All assemblies contain the VP5, the partial overlapping, nonstructural protein, of segment A. Similar to the polyprotein, the phylogenetic analysis of the VP5 grouped the assemblies into two distinct groups and further clustered them with the isolates from the Sp serotype ( Figure 2B). The two assembled groups differ in only one amino acid residue at position 90, with the first four assemblies carrying an arginine (R) while the two other assemblies have a glycine (G) (Supplementary File S3). Sequence analysis of the deduced amino acids of VP5 did not reveal any unique residue in the assembled isolates compared to all other sequences. However, the amino acid in position 98 differs between the Sp isolates (Supplementary File S3). At this position, all the investigated Sp isolates carry an R, while in the assemblies, the residue is tryptophan (W). It has been shown that the VP5's protein is generally produced during the initial stages of replication (Heppell et al., 1995a), and at least in the Ab serotype, it seems that this protein to be involved in anti-apoptotic functions (Hong and Wu, 2002). However, this product's importance in viral growth or virulence has so far remained ambiguous (e.g., Julin et al., 2013;Dopazo 2020).

Comparative Sequence Analysis of VP1
The phylogenetic analysis of VP1 clustered the assemblies into three distinct clades (Supplementary Figure S2B Table S3A). The nucleotide sequence comparisons show a greater divergence rate in the VP1 of the assemblies than those of the polyprotein, suggesting that the B segment of the genome has a higher rate of nucleotide mutation (Supplementary Tables S1,  S2). Comparing the deduced amino acid sequences of the assembled genomes to the VP1 section of 120 IPNV sequences further revealed only a single, almost unique residue in our isolates (Supplementary File S2). The amino acid residue in position 656 of the assembled isolates is glutamate (E), while in the majority of other strains, the amino acid is either aspartate (D) or alanine (A). The only other isolates, carrying an E in this position, are two sequences recovered from Atlantic salmon (KY548520; collection date: 2001) and Brown trout (Salmo trutta L.) (KY548519; collection date: 1989) in Finland following outbreaks (Holopainen et al., 2017). However, it should be noted that so far, the role of VP1 in determining the degree of IPNV virulence has remained ambiguous. For example, in a study conducted by Song et al. 2005, through construction and comparative assessment of chimeric IPNV strains, the authors suggested that VP1 is not involved in this pathogen's virulence. On the other hand, the data reported by Shivappa et al. 2004 suggests that the amino acids on the B segment of the genome in combination with specific residues in VP2 might help modulate the magnitude of virulence. Work in Frontiers in Genetics | www.frontiersin.org November 2021 | Volume 12 | Article 635185 the infectious bursal disease virus (IBDV) has also shown that variations in the amino acid residues of the VP1 can change the kinetics of viral replication and influence the virus's virulence (Liu and Vakharia, 2004). VP1 interacts with VP3 to form VP1-VP3 complexes, and as such, can affect various aspects of the virus biology, including its replication efficiency (Tacken et al., 2002;Pedersen et al., 2007).

CONCLUDING REMARKS
In light of the findings reported in this study and similar works, a key point worth noting is the importance of understanding the genetic interactions between the host and the pathogen in determining an infection's outcome. As a general trend, research studies tend to mainly focus on either the host or the pathogen's genetic polymorphisms while investigating the genetic basis of disease-related phenotypic variations. While the merits of such a targeted approach cannot be disputed, it is becoming increasingly important to be aware of the detailed genetic and genomic variations and interactions between the host and the microbe. In the case of monogenic infectious diseases, it is relatively trivial to speculate whether a new form of a pathogen interacts in fundamentally different ways with the host's defence mechanism. For instance, considering that variations in a single gene mainly determine resistance to IPNV in Atlantic salmon, one might expect that the mutations acquired by the variant reported in this paper might have offered an adaptive advantage, assisting the pathogen to establish successful infection, irrespective of the variations in the cdh1 gene. This, for example, can be examined by challenging and comparative transcriptome analysis of the QTL and non-QTL full-sib animals following infection with any of these isolates and the common variant in a controlled setting. Of course, pinpointing the causative mutations and understanding and validating the molecular basis of such adaptive variations is a substantial undertaking by itself. The task of identifying and untangling the dynamics of how genotypic variations in the host and the pathogen influence one another and affect the outcome of infection becomes more challenging while investigating diseases with polygenetic nature. This point becomes even more critical, considering our rapidly changing environments that can facilitate the transmission dynamics and the geographical spread of the pathogens (Reid et al., 2019). Fortunately, advancements in technologies such as sequencing, genotyping, computational analysis and molecular biology can now provide us with many necessary tools if we are set to understand the dynamics of the interactions between the host and the microbe at their detailed molecular levels.

DATA AVAILABILITY STATEMENT
All datasets generated for this study are included in the article (Supplementary Material) and have also been deposited in GenBank under the accession numbers MW496366, MW496367, MW496368, MW496369, MW496370, MW496371, MW496372, MW496373, MW496374, MW496375, MW496376 and MW496377.

ETHICS STATEMENT
Ethical review and approval was not required for the animal study as the data is based on a naturally occurring disease outbreak in a field.

AUTHOR CONTRIBUTIONS
GM, SJ and BH arranged for tissue sampling, shipment and all laboratory and sequencing procedures. HM performed analyses of the sequence data. BH and HM drafted the article. All authors contributed to the development of the paper, interpretation of the results and improvement of the article.