Selective Stimulation of Duplicated Atlantic Salmon MHC Pathway Genes by Interferon-Gamma

Induction of cellular immune responses rely on Major histocompatibility complex (MHC) molecules presenting pathogenic peptides to T cells. Peptide processing, transport, loading and editing is a constitutive process in most cell types, but is accelerated upon infection. Recently, an unexpected complexity in the number of functional genes involved in MHC class I peptide cleavage, peptide transport, peptide loading and editing was found in teleosts, originating from the second and third whole genome duplication events. Salmonids have expanded upon this with functional duplicates also from a fourth unique salmonid whole genome duplication. However, little is known about how individual gene duplicates respond in the context of stimulation. Here we set out to investigate how interferon gamma (IFNg) regulates the transcription of immune genes in Atlantic salmon with particular focus on gene duplicates and MHC pathways. We identified a range of response patterns in Atlantic salmon gene duplicates, with upregulation of all duplicates for some genes, like interferon regulatory factor 1 (IRF1) and interferon induced protein 44-like (IFI44.L), but only induction of one or a few duplicates of other genes, such as TAPBP and ERAP2. A master regulator turned out to be the IRF1 and not the enhanceosome as seen in mammals. If IRF1 also collaborates with CIITA and possibly NLRC5 in regulating IFNg induction of MHCI and MHCII expression in Atlantic salmon, as in zebrafish, remains to be established. Altogether, our results show the importance of deciphering between gene duplicates, as they often respond very differently to stimulation and may have different biological functions.


INTRODUCTION
Interferon gamma (IFNg) is a key activator of viral immunity and induces signaling through the JAK/ STAT pathway (1). IFNg binds and activates the interferon gamma receptor, thus initiating phosphorylation of Janus kinase 1 (JAK1) and JAK2 that again phosphorylate and activate the signal transducer and activator of transcription proteins (STAT) 1 and possibly STAT2. This activated STAT homo-or hetero-dimer then translocates to the nucleus where it binds to promoter gamma interferon activation sites (GAS) in IFNg inducible genes. One of the first elements induced by this STAT complex is IRF1, which collaborates with STAT in further transcriptional regulation of genes.
Activation of the JAK/STAT pathway induces transcription of a range of genes that are important for the cellular defense against virus infection, such as the MHC genes and other genes involved in antigen processing and presentation. Classical MHC class I (MHCI) genes are responsible for surveilling the intracellular space and present peptides from self-and nonself to CD8 positive T-cells thereby facilitating detection of virus-infected cells by the immune system (2). The MHCI molecule consists of an alpha chain non-covalently linked to beta2-microglobulin (b2m), where the alpha 1 and alpha 2 domain of the alpha chain are responsible for binding of self and non-self peptides. These two domains are also the ones with highest allelic diversity. The term classical refers to peptide binding MHC genes with high allelic diversity and a specific expression pattern such as the human HLA-A, HLA-B, and HLA-C genes with 3794, 4648, and 3503 registered protein alleles, respectively (3).
Expression of classical MHC genes is controlled by both transcriptional and posttranslational mechanisms. The canonical MHC promoter consists of a SXY sequence module that is bound by enhanceosome transcription factors consisting of RFX5, RFXAP, RFXANK, CREB/ATF1, NF-Y's and the MHCI transactivator NLRC5 or the MHCII transactivator CIITA (4,5). This SXY motif is also present in b2m, TAP and LMP2 promoters, suggesting a similar transcriptional enhanceosome control (6). Additional MHCI promoter elements consist of interferon stimulated response elements (ISRE) and GAS recognized by interferon response factors (primarily IRF1) and STAT1/ STAT2 binding [reviewed in (7)]. Some MHCI promoters also have other motifs that can be activated by NF-kB, upstream stimulatory factors (USF) and SP1 binding (8).
Peptide loading of MHCI molecules is a multi-step process starting with the degradation of peptides in cytosol. Upon IFNg stimulation, an immunoproteasome is induced through replacement of the constitutive proteasome components PSMB5, −6, and −7 with interferon stimulated immune-proteasome components PSMB8, −9 and −10 (9). Peptides produced by the immunoproteasome have increased affinity for both the TAP1/TAP2 channel transporting the peptide into the endoplasmic reticulum (ER), as well as for MHC class I molecules. Two additional interferon-induced subunits called PSME1 and PSME2 provide an added regulatory element to the core immunoproteasome with debated effect on the MHCI peptide repertoire (10).
Peptides produced by the immunoproteasome are transported into the ER by an MHC-peptide specific transporter consisting of the TAP1 and TAP2 heterodimer [reviewed in Neefjes et al. (11)]. An empty MHCI /b2m molecule, stabilized by CALR and ERP57, is linked to the TAP transporter by a tapasin (TAPBP) molecule and this complex is called the peptide-loading complex (11). As the optimal length of MHCI peptides varies between 8 and 12 residues, some peptides need further trimming by ERAP1 and ERAP2 molecules to fit perfectly into the MHCI groove (12). The peptides may also be exchanged for better fit peptides by the tapasin-related molecule (TAPBPR) (13,14) before the peptide-loaded MHCI molecule is transported to the cell surface for recognition by T-cells.
Whole genome duplication (WGD) is a widespread phenomenon in animals and plants. All vertebrates have experienced two rounds of whole genome duplications called the 2R-WGD. In teleosts, there has been a third WGD (3R-WGD), occurring ∼350 million years ago separating the bony fishes from other ray-finned fishes (15). Functional remnants of both the 2R-WGD and the 3R-WGD are still present in teleosts today (16). Salmonids and carp experienced a more recent fourth WGD (4R-WGD), further adding to the gene complexity in these species (17,18). At least in salmonids, many of the gene duplicates from 2-4R-WGD have been retained as functional copies (17).
We recently identified an unprecedented gene complexity in the antigen processing and presentation machinery in Atlantic salmon (16). For example, calreticulin (CALR), ERP57 and TAPBP all exist in functional gene duplicates originating from the second, third and fourth WGD. The TAPBP clade includes both previously identified TAPBP and TAPBP-related (TAPBPR) genes in addition to a newly identified TAPBP_like (TAPBPL) gene with low sequence identity to both TAPBP as well as TAPBPR gene sequences. Some genes are also present in single copies such as the classical Atlantic salmon MHC class I gene UBA, while its molecular partner, b2m, has expanded to 12 genes (16). Together, this provides an unprecedented peptide loading complexity of unknown functional relevance. Based on the observation that neo-functionalization of gene duplicates in Atlantic salmon was more frequent than sub-functionalization (17), this raises many questions regarding the function of these gene duplicates.
While Atlantic salmon UBA only exists as a single gene, there are currently 48 registered alleles for this locus in the IPD-MHC database (19). Atlantic salmon and rainbow trout UBA genes have promoter regions with regulatory elements consistent with IFNg induction (20,21). The UBA gene belongs to the MHC class I U lineage, alongside non-classical genes, all assumed to bind peptides originating from the peptide loading machinery (22). One additional MHCI lineage, the Z lineage, is also assumed to bind peptides, while the remaining four MHCI lineage (S-, L-, H-, and P-) bind other, as yet undefined, ligands.
Although we know that teleosts have expressed remnants of the second, third and fourth WGDs, no studies have so far investigated how individual duplicates of given pathways respond to stimulation. Here we use recombinant IFNg stimulation to understand how individual Atlantic salmon gene duplicates respond, with particular focus on genes involved in MHC pathways.

Recombinant Interferon Gamma Production
DNA, consisting of the mature open reading frame of the Atlantic salmon IFNg sequence NM_001171804.1, inserted into the pET151/D-TOPO vector was purchased from GeneArt (Thermo Fisher Scientific, Oslo, Norway). The DNA clone was transfected into E.coli BL21 DE3 (Thermo Fisher Scientific, #C600003).
Recombinant protein was isolated and purified following an established protocol (23). In short, 0.2 mM IPTG was used in the induction and the culture was grown at 28 • C. The bacteria were harvested and sonicated on ice in lysis buffer [50 mM NaH 2 PO 4 , 300 mM NaCl, 10% glycerol (pH 8.0)], and the recombinant protein was purified on a Ni-NTA resin column (QIAGEN, Oslo, Norway). The purity of the recombinant IFNg (rIFNg) was checked on a 4-12% precast SDS-PAGE gel (Thermo Fisher Scientific) stained with SimplyBlue (Thermo Fisher Scientific). Protein concentration was measured with a Qubit Protein Assay kit (#Q33211, Thermo Fisher Scientific).

Recombinant Interferon Gamma Stimulation
Salmon head kidney cells (SHK-1) (24) in passage 48 were seeded into 25 cm 2 flasks using trypsin/EDTA totalling 4 × 10 5 cells per each of the 18 flasks, and grown at 20 • C overnight in Leibovitz L15 medium with L-glutamine (Merck AS, Oslo, Norway, L1518-500ML) supplemented with 5% FBS and gentamicin. The next day RNA was isolated from five flasks and used as negative controls (0 h post stimulation samples; 0-hps), six flasks were stimulated with rIFNg at 1 ng/ml for 2 h (2-hps samples) and six flasks were stimulated with rIFNg at 1 ng/ml for 24 h (24-hps samples).

Cell Lysates
Lysates from one SHK-1 cell flask per time point were used to verify effect of rIFNg. Culture media was removed and cells were washed twice with ice cold PBS supplemented with 1 mM protease inhibitor AEBSF (Merck AS, #A8456). Cells were then lysed in the flask using 0.5 ml lysis buffer (150 mM NaCl, 20 mM Tris-HCl pH 8.0, 1 mM MgCl2, 2% Igepal (Merck AS, #I3021) and 1 mM AEBSF for 10 min (min) on ice. Lysates were incubated on ice with occasional rigorous shaking for 20 min, centrifuged at 13,000 × g for 30 min and supernatants containing the crude lysates transferred to a new Eppendorf tube prior to freezing at minus 20 • C.

Western Blotting and Immunostaining
Twelve µl of the lysates were added to 4x SDS reducing sample buffer prior to denaturing at 80 • C for 5 min. Samples were loaded on a BOLT 4-12% Bis-Tris gel (Thermo-Fisher Scientific #NW04125BOX), run for 20 min in 1xMES buffer (Thermo Fisher Scientific, # NP0002). The gel was blotted onto a 0.45 µm Immobilon-P PVDF membrane (Merck AS, Oslo, Norway #IPVH00010) for 70 min at 100 V in a 12.5 mM Tris-Hcl, 96 mM glycine, 20% ethanol solution. The blot was subsequently stained for 1 h in PBS 0.1% tween with undiluted monoclonal supernatant of primary antibodies against UBA (25)

RNA Isolation
Cells from five flasks from each time point were washed twice with ice cold PBS, trypsinized using 1 ml 0.25% EDTA-Trypsin and transferred to Eppendorf tubes. Cells were harvested at 2,500 × g for 5 min at 4 • C, and resuspended in 350 µl RTL buffer supplied in the RNeasy Mini kit (QIAGEN, #74104). RNA was isolated following the manufacturer's protocol, including a DNAse I treatment step (QIAGEN, #79254) on spin-column. RNA quantity and quality were analyzed using Nanodrop (Thermo Fisher Scientific), Qubit (Thermo Fisher Scientific), and Bioanalyzer (Agilent, Santa Clara, CA, USA) according to manufacturers' protocols. RNA quantity ranged from 78 to 196 ng/µl with absorbance 260/280 ranging from 2.07 to 2.14 and RIN values of 7.0-10.0.

RNA-seq Library Preparation and Sequencing
RNA-seq libraries were prepared using TruSeq stranded total RNA prep kit (Illumina Norway AS) targeting the poly-A tail to enrich mRNAs. 15 barcoded libraries (5 control; 5 rIFNg stimulated at 2 h time point; 5 rIFNg stimulated at 24 h time point) were pooled together and sequenced over 2 lanes of HiSeq 3000 (Illumina Norway AS) employing 150 bp paired-end sequencing. Library preparation and sequencing was performed at the Norwegian Sequencing Center, Oslo, Norway.

GO Term Analysis
Differentially expressed genes with a fold-change of 2 or more were used to perform a gene ontology (GO) analysis using DAVID bioinformatic resources (31), version 6.8. Gene lists were investigated for enrichment against terms in the following GO term lists: GOTERM_BF, GOTERM_MF and INTERPRO, otherwise using default parameters. Data are shown in Supplementary File 4.

Fragments per Kilobase of Transcript per Million Mapped Reads (FPKM) Calculations
Cleaned RNA-seq reads were aligned to the Sasa-UBA * 0201 and Sasa-UBA * 0301 alpha 1 domain nucleotide sequences and the fragments aligning to these gene sequences were counted as above. FPKM values were calculated using the number of fragments aligned, the total number of successfully aligned fragments to the entire transcriptome and the gene length (261 nucleotides) using the standard formula: FPKM = (fragments_aligned * 10 ∧ 9)/(total_number_of_successfully _aligned_fragments * gene_length).

Data Mining
Both GenBank nucleotide and protein databases were mined for mammalian and teleost gene orthologs. Gene duplications were then identified through tblastn search of the Atlantic salmon genome. Orthology was tested using clustal alignments (32) and phylogenetic analyses using MEGA (33). Gene homeologs were defined based on regional location (17). We use standardized nomenclature for gene duplicates originating from the fourth salmonid-specific 4R-WGD where one homeolog is given the extension -a and the other homeolog is given the extensionb. Some gene duplicates mostly originating from the 3R-WGD have been given the extension -like or -L, while gene copies of unknown location have consecutive numbers.

Data Generation and Processing
To verify the effect of the rIFNg and select sampling time points for RNA-seq, we first evaluated the expression of MHC class I UBA and b2m protein levels in rIFNg stimulated cells. Salmon head kidney (SHK-1) cells (24) were stimulated with rIFNg for 2 or 24 h and cell lysates were immunoblotted using available monoclonal antibodies (25,26). We observed little visual increase at 2 h post stimulation (2-hps), but after 24 h post stimulation (24-hps) both UBA as well as b2m protein levels were visually increased compared to negative controls (Supplementary File 1). Thus, we chose to perform the RNA-seq study using 2-hps to detect early response genes and 24-hps to represent later responses.
RNA-seq generated 66-140 million 150 base pair paired-end reads of which more than 98% (65-139 million) of the reads were retained after trimming and removing low quality bases and reads mapping to the sequencing adapters and PhiX (standard Illumina spike-in used sequencing). 84-92% of the cleaned reads aligned to the Atlantic Salmon genome. Principal Component Analysis (PCA; Figure 1) of the normalized count data (using DESeq2) showed a clear separation between the three groups of samples and grouping within the replicates. This indicates that the biological variability is the main source of variance in the data and the experiment and RNA-seq were performed as expected.

Differentially Expressed Genes
Differentially expressed genes (DEG) were identified between groups with the cut-off set at a minimum log 2 fold change difference of one and adjusted p < 0.05. At 2-hps, a considerable number of genes were both upregulated as well as downregulated in our material (Figure 2). The major

Our Cellular Model System
Based on cellular enzyme activity pattern and the ability to internalize inactivated bacteria, SHK-1 cells were originally described as macrophage-like, but with the reservation that their elongated, flattened morphology and lack of bactericidal activity were less consistent with this phenotype (24,34). Later, a monoclonal antibody raised against lysates of virus-infected SHK-1 cells was found to react stringently with endothelial and red blood cells in Atlantic salmon tissues (35). Thus, we evaluated the expression of endothelial-specific gene transcripts. We found that SHK-1 cells express moderate to high levels of many typical endothelial-enriched genes [CDH5.1, KDRL, DUSP5, PAR1; (36-38); Supplementary File 3], but miniscule amounts of Von Willebrandt factor (VWF).
We here use miniscule expression as below 20 matching reads, low expression as 21-150 matching reads, medium expression as 151-1,000 matching reads and high expression as above 1,000 matching reads. Endothelial cells are a heterogeneous group of cells where expression profiles vary depending on type and tissue origin potentially explaining the lack of VWF (39). Also, extracting ECs from their natural environment and placing them in culture also influence their expression profiles. As shown by the data presented below, the SHK-1 cells do respond as would be expected of endothelial cells upon IFNg induction with strong upregulation of CXCL10, CXCL11 and guanylate-binding protein 1 (GBP1) (40). However, because our current knowledge about transcription profiles of teleost cells is limited, a definite classification of this cell line is currently not possible.

GO Term Enrichment
A GO analysis confirmed a prototypic response to interferon gamma. The list of genes upregulated 2 hps was enriched for terms associated with transcriptional regulation and interferonsignaling. At 24 hps we also observed enrichement for terms related to the proteasome and cell cycle regulation. Enriched terms for each contrast are detailed in Supplementary File 4.

Activation of the JAK/ STAT Pathway
An obvious limitation of using the Atlantic salmon genome as a reference for our analysis is the lack of nomenclature discriminating between gene duplicates. Thus, to ensure detection of all duplicated genes, we performed homology search, phylogenies and sequence identity analyses of genes from various Atlantic salmon, other teleost and human gene sequences for each gene of interest ( We first evaluated the expression and regulation of genes involved in the JAK/STAT pathway. Four JAK1 genes and three JAK2 genes are present in Atlantic salmon (Supplementary Files 5-7). The ssJAK1 and ssJAK1L duplication occurred in an ancestor of pike where other teleosts do not seem to have this gene duplication. The salmonid specific 4R-WGD then duplicated each of these 3R-WGD genes into ssJAK1a and ssJAK1b, and ssJAK1.La and ssJAK1.Lb (Supplementary Files 6, 7). For JAK2, two genes are 4R-WGD homeologs (ssJAK2.1a, ssJAK2.1b) and an additional duplicate was denoted ssJAK2.2. An expected homeolog to the ssJAK2.2 gene seems to have been lost on chromosome 20. In our dataset, the ssJAK1a, ssJAK1.La and ssJAK2.2 genes all have more than 490 matching transcripts at 0-hps, while the remaining four genes are supported by < 120 matching reads ( Table 1, Supplementary File 3). Only ssJAK1.La displays a log 2 fold change above 2 at 24-hps while the other two genes are downregulated at 24-hps. As JAK1 and JAK2 are activated by IFNg through phosphorylation of pre-existing as well as newly synthesized JAK proteins, three of the seven JAK1/2 genes potentially influence downstream signaling in response to rIFNg.
Four duplicate STAT1 genes have been reported in rainbow trout (41). Only three of these (STAT1.1a, STAT1.2a and STAT1.2b) are annotated in the Atlantic salmon genome, where the Atlantic salmon chromosome 17 ortholog to the trout STAT1a2 gene may or may not be a pseudogene (Supplementary Files 5-7). Three previously reported Atlantic salmon ssSTAT2 sequences all cluster with our ssSTAT2a sequence suggesting sequence polymorphism in this gene [AHN82335.1, AHN82336.1, ACM43809.1 (42)]. The annotated genome ssSTAT1.2a and ssSTAT2a sequences have truncated 3' regions, but that may or may not be a genome artifact. Moreover, for the ssSTAT1.2a gene, the genome contains more exons than included in the annotated sequence. If the ssSTAT2b gene is one the verge of pseudogenization, thus lacking the dimerization and phosphorylation region, it would match the findings of Dehler and co-workers that salmonids only have one STAT2 gene (41), and also explain the low number of reads found for this gene in our study.
Only ssSTAT1.2a and ssSTAT2a display moderate to high expression levels at all time points, while the remaining four STAT genes only have < 61 matching transcripts ( Table 1, Supplementary File 3). ssSTAT1.2a is more affected by rIFNg than the ssSTAT2a gene where ssSTAT1.2a reaches a 2.8 log 2 fold increase at 24-hps as compared to 0.7 for ssSTAT2a. Although ssSTAT2a has a six times higher 0-hps level than ssSTAT1.2a, it reaches approximately the same number of matching transcripts as ssSTAT1.2a at 24-hps. Thus, one of four ssSTAT1 genes and one of two ssSTAT2 genes respond to rIFNg stimulation with potential effect on downstream JAK/STAT pathway genes. In a chinook salmon cell line, STAT2 knockout did not affect downstream signaling following IFNg stimulation, suggesting that STAT1 operated as a homodimer (41). Instead, STAT2 was essential for the type I interferon pathway.
Following translocation to the nucleus, these phosphorylated STATs induce transcription of a range of genes, amongst them the IRFs (8). Indeed, IRF1 is one of the highest induced genes in our material. 0-hps transcription levels of the 4R-WGD ssIRF1a and ssIRF1b homeologs are limited to a few transcripts, but reach a 5-6 log 2 fold increase after 2-hps ( Table 1, Supplementary Files 3, 5-7). The expression levels of both genes decrease at 24-hps indicating efficient negative regulation. ssIFR1 is thus the major factor affecting downstream responses as duplicate ssIRF9 genes are barely expressed (Supplementary File 3).

Interferon Stimulated First Line of Defense
The stimulatory effect of interferon gamma needs to be transient and is therefore heavily regulated by suppressor of cytokine signaling 1 (SOCS1), which turns off the JAK phosphorylation in turn reducing the phosphorylation of STATs (43). SOCS1 is also an inhibitor of the IFN signaling pathways in teleosts (44). There are five SOCS1 duplicates in the Atlantic salmon genome, all residing on unplaced scaffolds (Supplementary Files 3, 5). Three of these duplicates are heavily induced in our dataset (SOCS1.1-1.3), while the remaining two duplicates show no expression (SOCS1.5 and SOCS1.6) ( Table 2

, Supplementary Files 3, 5-7).
As expected, expression of SOCS1 genes peak early after IFNg stimulation, with a 5-6 log 2 fold increase at 2-hps with a decline to log 2 fold increase of 4 at 24-hps. IFNg is produced by activated T cells, NKT cells and epithelial cells, amongst other in response to viral infection, and activates anti-viral genes through the JAK/ STAT pathway discussed above. One such viral resistance gene is the Myxovirus resistance (MX) gene, of which Atlantic salmon has 10 gene copies as opposed to the two MX genes in the human genome (45) (Supplementary Files 5-7). Eight duplicates are induced by interferon gamma in our material, ranging from 1.4 to 7.6 log 2 fold increase with ssMX8 being the most affected gene (Table 2, Supplementary File 3). This is in line with results from a previous study (45), where ssMX8 also was strongest regulated by rIFNG stimulation as opposed to MX1-3 genes that displayed a stronger response to type I IFN. Also two rainbow trout sequences clustering with this ssMX8 sequence (onmy-MX5 and onmy-MX6, Supplementary File 6), are more stimulated by IFNg than IFNa (Supplementary File 6) (46).
IFNg stimulation also stimulates the production of chemokines that attract immune cells to infected sites. Three Atlantic salmon chemokines all denoted CCL19like in the genome, belonging to the teleost CCL19/21/25 clade clustering with human CCL19, CCL21 and CCL25 sequences (47,48), are highly induced in our dataset ( Table 2, Supplementary Files 3, 5-7). The CC chemokine we named ssCK13b peaked with a log 2 fold induction of 6.2 at 2-hps. Its 4R-WGD homeolog ssCK13a does not peak at 2-hps, but continues to a log 2 fold increase of 8 at 24-hps. The third CC chemokine, called ssCK10 (47), is a single copy gene and reaches a log 2 fold induction of 6.4 at 24-hps. Which cell types the CK13 chemokines attract remains unknown, but a teleost CK10 ortholog (CCL19-like; AVH76855.1) was shown to attract lymphocytes and monocytes/macrophages, with no significant effect on neutrophil migration (48). In humans, the IFNg induced CXC chemokines CXCL9-11 mediate recruitment of T cells, natural killer cells and monocytes/macrophages to the infection site (49). A trout CXCL11_L1 gene (alias γIP-10) has previously been shown to be interferon gamma inducible with ISRE and GAS elements in its promoter (50)(51)(52). In zebrafish, CXCL11 was also shown to attract macrophages in zebrafish embryos through CXCR3 signaling (53). In our model system, the three regionally duplicated CXC chemokines ssCXCL11_L1.1, ssCXCL11_L1.2, and ssCXCL11_L1.3, all peaked with log 2 fold increased transcription of 7 at 2-hps while induction levels decreased at 24-hps (Table 2, Supplementary Files 3, 5-7).
Induction of MX, chemokines and SOCS are all expected following IFNg stimulation and validate a biological function for our rIFNg. A gene that has not been studied extensively in relation to IFNg stimulation is interferon induced protein 44 (IFI44). In humans, IFI44 has been reported to be induced by interferon I but not interferon gamma (54). Its exact function remains unresolved, but in mammals it has both been reported to support as well as suppress viral replication (55,56). In teleosts, IFI44 has diversified to 20 IFI44 and IFI44-like (IFI44.L) genes in zebrafish (57) and 12 IFI44 and IFI44.L genes in Atlantic salmon (Supplementary Files 5-7). The two duplicated human IFI44 and IFI44.L genes are orthologs of the teleost IFI44 gene, and not the IFI44.L genes. Spotted gar, a species that only experienced two WGDs, also has an expansion with at least 11 IFI44 genes, but no detectable IFI44.L genes (data not shown). The IFI44.L genes thus seem to have emerged in a basal teleost resulting from the teleost specific 3R-WGD.
IFI44.L genes showed the highest log 2 fold change of all genes in our study (Table 2, Supplementary File 3). The ssIFI44.L1a gene on chromosome 12 has only a few transcripts at time zero, but reaches a log 2 fold increase of 16 at 24-hps. The four regionally duplicated ssIFI44.L1 genes on chromosome 22 (IFI44.L1b1-b4) are also highly upregulated with log 2 fold increase of 4.5-10.9 at 24-hps. In comparison, three of the six ssIFI44 genes display moderate log 2 fold increase of 1-3 at 24hps. In zebrafish, unfortunately only one of the ten IFI44.L genes (ENSDARG00000010729) were included on the microarray used, displaying a log 2 fold increase of 3 upon virus infection (57). The extreme induction of IFI44.L in Atlantic salmon calls for further studies on the biological role of these IFI44.L genes in teleosts.
Other typical IFNg inducible genes are also affected by rIFNg in our material, including the fourteen interferoninduced guanylate-binding protein 1 (GBP1) genes, where the highest induced gene duplicate reached a log 2 fold increased transcription of 5.3 at 24-hps (Supplementary File 3).

MHC Class I Pathway Genes
The induced protein levels of the classical MHCI gene UBA at 24-hps used to verify the biological function of our recombinant IFNg, was reflected in the transcript levels. UBA is highly transcribed at 0-hps, but still has a log 2 fold increase of 2.3 at 24hps (Table 3, Supplementary File 3). A few non-classical MHCI genes also have log 2 fold values above 2, including the U lineage gene ULA and the L lineage genes LGA and LIA, although both their 0-hps and also 24-hps transcript levels are much lower than the UBA locus.
The canonical W/S, X2 and Y/Enhancer B promoter motifs known to bind the mammalian MHCI and b2m enhanceosome (4), are also found in many teleost MHC class I promoter sequences (20,21). In our study, only a few of these Atlantic salmon transcription factor duplicates, i.e., ssRFXAPa and ssRFXANK display a moderate rIFNg upregulation at 24hps (Table 1, Supplementary File 3). In humans, the IFNg stimulated master regulator NLRC5 controls the SXY induced transcription of many antigen presentation pathway genes including MHC class I and b2m (4,58). The Atlantic salmon ssNLRC5b displays a log 2 fold increase of 4.4 at 24-hps, while its 4R-WGD homeolog is barely transcribed. We find no transcripts for the duplicate ssRFX5 homeologs and only low transcript levels for one of the nine ssNFYA/B/C subunit genes (ssNF-YB.1a, Supplementary File 3), suggesting the enhanceosome does not affect IFNg induced transcription of antigen presenting genes in this model system. This is opposed to the situation in rats, where lack of RFX5 severely impairs MHCI expression (6). Various mammalian MHCI genes can use other transcription factors such as STATs, IRFs, SP1, USFs, AP1 and NF-kβ (59, 60). In our model system, we find no matching reads for the single SP1 gene and only low transcript levels for one of the seven USF1 and USF2 genes (USF2.2a) (Supplementary File 3).
For the NF-kβ family, five of eight genes display moderate to high expression levels (NF-kβ_p65a, NF-kβ_p65b2 and NF-kβ_p105b; Supplementary File 3).
Without a functional enhanceosome, ssSTAT1 and ssIRF1 transcription factors seem likely candidates for the UBA induction (Table 1, Supplementary File 3). Multiple sites for STAT/IRF1 transcription factor binding are present in Atlantic salmon and rainbow trout UBA promoters with both GAS and ISRE elements (20,21). A similar IFNg induction of MHCI by STAT1 and IRF1 has been reported in humans (8,61).
The SHK-1 cells are heterozygous for the Sasa-UBA * 0201 and Sasa-UBA * 0301 alleles (21). However, alignment to the completely homozygous Atlantic salmon genome will only list reads matching the Sasa-UBA * 0301 allele. The two SHK-1 UBA alleles have 65 percent sequence identity in their alpha 1 domain, but 100 percent sequence identity in the alpha 2 and downstream regions. Thus, the majority of matching reads would not discriminate between alleles. To identify expression levels for each allele we calculated the FPKM values using the alpha 1 domain sequences only. As seen in the genome based DEG analysis, the transcript levels of both alleles were substantial even at 0-hps with FPKM values of 536 and 1,057 for the UBA * 0301 and UBA * 0201 alleles, respectively. Both alleles were equally induced at 24-hps reaching high average FPKM values of 2,191 and 4,440 for the UBA * 0301 and UBA * 0201 alleles, respectively. Similarly, high basal levels of MHCI transcripts in unstimulated cells were also found in mouse lymphatic endothelial cells (62).
Following transcription and translation, newly synthesized UBA molecules need assistance by CALR and protein disulfide isomerases (ERP57 alias PDIA3) to achieve proper folding. This gene is present in six copies in Atlantic salmon originating from the 2R-, 3R-, and 4R-WGD [Supplementary File 6 (16)], but none are largely affected by rIFNg. ssERP57 and ssERP57L genes originate from the 2R-WGD, ssERP57L1 and ssEP57L2 are remnants of the 3R-WGD while the salmonid specific 4R-WGD duplicated the ssERP57 and ssERP57L2 genes (16). ERP57a and ERP57b are highly expressed in normal tissues as well as in untreated SHK-1 cells (Supplementary File 3). None of these five genes display a log 2 fold change above 1.3 (Supplementary File 3).
Atlantic salmon b2m sequences originating from 12 different loci, segregate into two distinct clades here defined as the BA1 (ssb2m1-5) and the BA6 clades (ssb2m6-12) (Supplementary Files 5-7). Due to high b2m sequence identity between individual gene sequences, it is problematic to match reads to individual transcripts. Instead, we compared transcription levels of the two clades. The BA6 clade has four times more transcripts at time zero, but only reaches a log 2 fold increase of 0.65 at 24-hps as compared to the 1.66 log 2 fold induction of sequences from the BA1 clade ( Table 3

, Supplementary File 3).
We assume that most MHCI lineage genes are non-covalently associated with b2m, but we do not know if the two b2m clades represent molecules with functional preference for different MHCI molecules. Looking at the b2m promoters, the dual b2m clustering is also apparent in the promoter regions (Supplementary File 5). ssb2m1-5 and ssb2m7 have 930 or more completely identical nucleotide sequences upstream of the start codon and should thus respond equally to IFNg stimulation. b2m8, b2m11, and b2m12 also share 500 bp of identical promoter nucleotide sequence upstream of the start codon and should also mount similar responses to stimulation. Nevertheless, distant transcription factor elements and enhancers may also influence transcription, possibly explaining the differences in IFNg induction observed in our data set.
In mammals, IFNg stimulates replacement of the constitutive proteasome subunits PSMB5-PSMB7 by the subunits PSMB8-10 representing the immunoproteasome (9). This shift in proteasome components produces peptides preferably with hydrophobic or positively charged residues at the C terminus, which are optimally suited for binding to the antigen transporter TAP and to MHC Ia molecules. In addition to the PSMB8 subunit, teleosts have an additional unique PSMB9 subunit denoted PSMB12 as well as an additional PSMB10 subunit denoted PSMB13 (63). In Atlantic salmon, the PSMB8-13 subunit genes all reside as copies in both duplicate MHCI regions on chromosome 14 and 27 (16) (Supplementary File 8). All these duplicate Atlantic salmon PSMB subunits display a log 2 fold increased expression level of 1.8-3.0 at 24-hps with ssPSMB8b and ssPSMB13b being most affected (Table 3,  Supplementary File 3). Overall, slightly higher transcript levels are seen for ssPSMB subunits from the Ib region than those linked to the UBA gene on chromosome 27. Sequence identity between -a and -b duplicates is 92-100% (Supplementary File 7), increasing the risk of individual reads being matched to the wrong gene duplicate. In contrast, the sequence identity between the ssPSMB9 vs. ssPSMB12 and ssPSMB10 vs. ssPSMB13 is 48-58%, thus there is minimal risk of errors in matching reads. As opposed to zebrafish and chicken (63,64), no functional MHCI haplotypes exist in Atlantic salmon, displaying no allelic variation in the regionally linked genes (16).
The functional roles of the dual PSMB9 vs. PSMB12 and PSMB10 vs. PSMB13 molecules in teleosts are unclear, and the Atlantic salmon duplicated -a and -b versions of these four molecules adds another layer of complexity. In mammals, there are intermediates between the standard proteasome complex and the immunoproteasome with varying number of IFNg induced subunits (65), suggesting this could also be the case for teleosts. Assuming the different ssPSMB9 vs. ssPSMB12 and ssPSMB10 vs. ssPSMB13 subunits are inter-replaceable, Atlantic salmon has 32 alternative (immune)proteasomes. Evaluating the transcription and induction of these molecules, only ssPSMB8a and ssPSMB9a genes display low expression levels even at 24hps ( Table 3, Supplementary File 3), potentially reducing the number of different (immune)proteasome combinations to 12.
Miniscule number of reads in our data sets match the single ssPSMB7 gene (Supplementary File 3), which is peculiar, as the protein encoded by this gene is a component of the constitutive proteasome One possibility is that the interferoninducible duplicate ssPSMB10 replaces the ssPSMB7 subunit also in the constitutive proteasome in Atlantic salmon. The substrate binding pockets of PSMB7 and PSMB10 in humans and zebrafish are essentially identical-showing the same trypsin-like activity (66,67). However, ESTs matching the ssPSMB7 subunit are reported in GenBank (e.g., EG833924.1), so low transcription may be unique to SHK-1 cells.
Examining the other proteasome subunits, the majority have a log 2 fold increase below 1.2 with the exception of ssPSMA6.2a gene, with a log 2 fold value of 2.9 at 24-hps (Supplementary File 3). The functional relevance of such an induction cannot be determined based on transcriptional analysis alone. Three other transcribed ssPSMA6 genes exist, displaying a high number of matching transcripts at all measured time points. Thus, if and how the ssPSMA6.2a subunit induction affects the immunoproteasome is unclear. One of the ssPSMA6 subunits has been reported to be downregulated in SHK-1 cells following rIFNg treatment for 24 h, but the authors used a proteome approach that does not discriminate between duplicate genes (68). Additional regulatory mechanisms could result in only some of these ssPSMB transcripts being translated into functional proteins available for the immunoproteasome, however, this hypothesis would need to be validated by further experimentation.
Also the mammalian regulatory proteasome subunits PSME1 and PSME2 are upregulated by IFNg (69). In Atlantic salmon, there are 4R-WGD homeolog genes for the ssPSME1, ssPSME2 and ssPSME3 subunits with an additional third copy of the ssPSME3 gene (Supplementary File 5). ssPSME1b and ssPSME2b respond in a similar manner as their mammalian counterparts with log 2 fold increase of 1.8 and 2.0 at 24-hps, respectively (Table 3, Supplementary File 3). The remaining PSME subunits display a log 2 fold increase of 0.4-1.4.
Once peptides are generated by the (immuno)-proteasome, they are transported into the ER by the TAP1/TAP2 channel where both subunits originate from IFNg inducible genes in humans (70). There are three ssTAP2 genes and one ssTAP1 gene in Atlantic salmon. ssTAP2a and ssTAP2b reside closely linked to the U-lineage genes on chromosome 27 and 14 (Supplementary Files 3, 7, 8), while the gene denoted ssTAP2c resides 9 Mb upstream on chromosome 14 and the ssTAP1 genes resides on chromosome 5. The ssTAP1 gene displays a log 2 fold induction of 2.6 at 24-hps ( Table 3

, Supplementary File 3).
Just looking at fold increase, the ssTAP2a gene linked to UBA on chromosome 27 has a log 2 fold increase of 3.6 at 24-hps while the ssTAP2b gene on chromosome 14 has a log 2 fold increase of 5.2 ( Table 3). However, we find 18 times more transcripts for ssTAP2a than for ssTAP2b (Supplementary File 3). Assuming that all these transcripts are translated into proteins with similar efficiency, our data suggest that ssTAP2a may have a more prominent role in transporting peptides than ssTAP2b. Although upregulated, the ssTAP2c gene has fewer matching transcripts, but with a 50 % sequence identity to the other two TAP2s (Supplementary File 7), its function remains unknown. A previous study demonstrated no polymorphism in Atlantic salmon ssTAP2s (16) as opposed to what is found in zebrafish and chicken (63,71). One could speculate that presence of duplicate homeolog genes has eliminated the need for such allelic diversity.
A study of IFNg responsive elements in rainbow trout TAP2 and PSMB9a promoters showed that the PSMB9a gene required both ISRE as well as GAS elements to respond to stimulation (51). The rainbow trout TAP2 gene on the other hand only required the ISRE element to respond. As IRF1 binds to ISRE elements and STATs bind to GAS elements (8), STAT1 and IRF1 seem to be master regulators of antigen processing and transport in salmonids.
The MHCI molecule, awaiting peptides in the ER lumen, is linked to the TAP1/TAP2 channel by TAPBPs. We previously reported six ssTAPBP and ssTAPBP-like (TAPBPL) genes in addition to one ssTAPBP-related (TAPBPR) gene in Atlantic salmon (16) (Supplementary File 7). We find that the UBA linked ssTAPBPa on chromosome 27 has a log 2 fold increase of 4 at 24-hps as compared to the duplicate ssTAPBPb gene on chromosome 14 that has miniscule number of matching transcripts (Table 3, Supplementary File 3). This is similar to what is found in normal tissues where the ssTAPBPa gene is expressed ∼10 times more than the ssTAPBPb gene (16). An additional ssTAPBPc duplicate on chromosome 14, located 9 Mb upstream closely linked to a non-classical MHCI locus, shows no transcripts. As opposed to that found in chicken (64), Atlantic salmon ssTAPBP genes are not polymorphic (16). The recently discovered TAPBPL sequences are also present in marsupials and birds, but lost en route to humans (16). The 4R-WGD duplicates ssTAPBPL1a and ssTAPBPL1b show a log 2 fold increase of 1.5-1.7 at 24-hps while the ssTAPBPL2 gene has a log 2 fold increase of 3 ( Table 3, Supplementary File 3). As in normal tissues, ssTAPBPL1b displayed the overall highest expression levels in unstimulated cells, but is outnumbered by TAPBPL2 at 24-hps (16). The TAPBPL genes have unknown functions, but share low sequence similarity with both ssTAPBP and ssTAPBPR sequences (Supplementary File 7). As there are no ssTAPBPR transcripts in our study material, the ssTAPBPL sequences may hold a similar function in our model system. In normal tissues, the ssTAPBPR ssTAPBPL1b gene have similar transcript levels (16). Future studies are needed to clarify the function of TAPBPL molecules in teleosts as well as in other vertebrates.
Following peptide loading by the TAP/TAPBP complex, MHCI peptides are further trimmed by the ERAP1 and ERAP2 molecules, both IFNg inducible genes (72). In mammals, the ERAP1 and ERAP2 molecules have different specificities, where a heterodimer creates a complex with superior peptide-trimming efficiency (73). The ssERAP1 and ssERAP2a genes both respond with log 2 fold increase of 2.0-2.4 with four times more ssERAP1 transcripts than ssERAP2a (Table 3, Supplementary File 3). The additional ssERAP2b gene displayed no transcription in our material, but is expressed in normal Atlantic salmon tissues (16).

MHC Class II Pathway Genes
Despite not generally being regarded as antigen presenting cells, some endothelial cells also have IFNg inducible MHC class II (74, 75) although at a much lower level than MHC class I at least in lymphatic endothelial cells (62). In SHK-1 cells, the single classical MHC class II alpha (DAA) and MHCII beta (DAB) genes (76) were induced by rIFNg, but their transcript levels were 100 times lower than that seen for the classical MHCI gene UBA (Table 4, Supplementary File 3). One of the duplicate MHCII chaperone Invariant chain (Ii) genes displayed a similar transcript and induction level.
In mammals, MHCII transcription is also dependent upon the CREB-RFX-NFY enhanceosome with CIITA as the master regulator (77). Although the single ssCIITA gene has a log 2 fold increase of 2.3 at 24-hps, the complete lack of RFX5, NFYA and NFYC transcripts also here point to STAT1 and IRF1 as responsible transcription factors for MHCII transcription (Table 1, Supplementary File 3). Surprisingly, in zebrafish, IFNg induced IRF1 and subsequently CIITA, where IRF1 either alone or in combination with CIITA was sufficient to induced MHCII transcription through binding to ISRE elements in the MHCII promoter (78). The enhanceosome was thus not involved in IFNg induction of MHCII expression. Our findings suggest that this may also be the case for MHCII induction in Atlantic salmon and raises the possibility that instead of CIITA, NLRC5 may collaborate with IRF1 to induce transcription of MHCI. If and how STATs also make a contribution remains to be established.
Cathepsin S and L are essential for MHCII maturation through trimming of Ii, but also contribute to the peptide repertoire available for MHCII (79). In Atlantic salmon there are 10 cathepsin L genes (Supplementary Files 5-7).
Transcripts are highly abundant from one gene only and none of the gene duplicates are affected by rIFNg ( Table 4,  Supplementary File 3). For cathepsin S, the two 4R-WGD homeolog Atlantic salmon genes ssCTSS1a and ssCTSS1b are both induced by rIFNg (Supplementary Files 3, 5-7). They reach log 2 fold values of 1.6 and 3.2 at 24-hps where ssCTSS1a is by far the most transcribed gene. In mammals, IFNg induced cathepsin S transcription is mediated by IRF1 (80), again pointing to a master regulatory role for ssIRF1 in rIFNg mediated induction of Atlantic salmon MHC pathway genes.

Selective Stimulation of Duplicate Genes
Lien et al. (17) found that overall 55 percent of Atlantic salmon 4-WGD duplicates were retained as two functional gene copies where neo-functionalization was more predominant than subfunctionalization. In our material, 41 of 71 homeolog 4-WGD gene pairs analyzed in detail displayed expression of both duplicates where 13 of these had 4 times or higher expression levels for one of the homeologs (Supplementary File 3). Eight gene pairs were not expressed in our material while 22 gene pairs showed expression of one homeolog only. Perhaps not unexpectedly as our model system was a cell line, only one homeolog being expressed was more common for transcription factors (15/33) than for MHCI pathway genes where both duplicates were mostly expressed (19/31). Transcription factors may be differentially regulated in different cell lines, while there may be a functional advantage of having duplicate expressed MHC pathway genes. Comparing log2 fold change for all gene duplicates displays a varied pattern ( Table 5). For JAK and STAT genes, rIFNg has a major impact on only one of the duplicates. For IRF1 and the many of the first line of defense genes, the alternative approach is operational. Both the IRF1 homeologs are highly induced by rIFNg and peak at 2-hps.
Also for the CC and CXC chemokines CK10, CK13, and CXCL11.L genes, all gene duplicates are heavily upregulated where some genes peak at 2-hps and other continue to increase until 24-hps and potentially even longer. This suggest that there is a functional advantage of having several genes responding to stimulation potentially with slightly different responses as seen for CK13a and -b. SOCS1, a negative regulator of the JAK/STAT pathway, has three highly induced genes that peak at 2-hps, thus providing an early regulation of the rIFNg induction.
Also five of the six IRF44.L genes are heavily induced by rIFNg with one peaking at 2-.hps while the remaining displayed increased expression until 24-hps. Only two of the six IFI44 genes have log2 fold values above 2.9. Future studies of IFI44L may explain why these genes are so heavily stimulation.
For the MHC pathway genes, the rIFNg stimulation is less pronounced with log2 fold values below 5.2. All the duplicate MHC linked proteasome genes display similar induction levels, although the ssPSMB8a and ssPSMB9a genes only display low transcription levels. Also the ssPSME1 and 2 genes are expressed in duplicate, but there the expression and induction levels are similar for both genes.
All ssTAP2 genes are expressed, but although the ssTAP2b gene linked to the non-classical genes display the highest log 2 fold change, the UBA-linked ssTAP2a gene has an 18 times higher transcript level. The UBA-linked ssTAPBPa gene is also unique in responding to rIFNg while the -b duplicate has miniscule transcript levels. The three un-linked ssTABPL genes are all induced by rIFNg, where the ssTAPBPL2 gene is most induced. How they affect MHCI peptide loading remains to be defined.

CONCLUSION
Recombinant interferon gamma strongly upregulated a panel of expected early response genes that peaked at 2-hps. The observed transcriptional response could be compatible with several cell types, but suggests that SHK-I cells are not professional antigenpresenting cells, but have a more endothelial-like phenotype. MHC pathway genes had a slower response, with highest induction levels at 24-hps and potentially still rising. Atlantic salmon gene duplicates showed a range of rIFNg response patterns. While rIFNg upregulated all duplicates for some genes such as ssIRF1 and ssIFI44.L, others such as ssTAPBP and ssERAP2 only responded with one or a few of the duplicates.
Lacking expression of both ssRFX5 and ssNFYA/B components, it is unlikely that the enhanceosome was involved in regulation of MHC pathway genes in our material. Instead, ssIRF1 alone or in combination with ssSTAT1 seems to be the master regulator(s) of the IFNg response in Atlantic salmon. The enhanceosome may have a more predominant role in other cell types. If IRF1 also collaborates with CIITA and possibly NLRC5 in regulating IFNg induction of MHCI and MHCII expression in Atlantic salmon as in zebrafish remains to be established. Overall, our results show the importance of deciphering between gene duplicates as they often respond very differently to various stimulations and most likely have very different functions.

DATA AVAILABILITY STATEMENT
The datasets generated for this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found at: https://www.ncbi.nlm.nih. gov/, PRJNA637094.

AUTHOR CONTRIBUTIONS
UG was responsible for conceptualization, methodology, investigation, data visualization, formal analysis, project administration, funding acquisition, and prepared the original manuscript. AS contributed with software, formal analysis, data curation, and visualization. JF contributed with formal analysis and data visualization. All authors contributed to manuscript review and editing.

FUNDING
This research was funded by the Norwegian Research Council projects #274635 and #254876.