Discovery and Functional Annotation of Quantitative Trait Loci Affecting Resistance to Sea Lice in Atlantic Salmon

Sea lice (Caligus rogercresseyi) are ectoparasitic copepods which have a large negative economic and welfare impact in Atlantic salmon (Salmo salar) aquaculture, particularly in Chile. A multi-faceted prevention and control strategy is required to tackle lice, and selective breeding contributes via cumulative improvement of host resistance to the parasite. While host resistance has been shown to be heritable, little is yet known about the individual loci that contribute to this resistance, the potential underlying genes, and their mechanisms of action. In this study we took a multifaceted approach to identify and characterize quantitative trait loci (QTL) affecting host resistance in a population of 2,688 Caligus-challenged Atlantic salmon post-smolts from a commercial breeding program. We used low and medium density genotyping with imputation to collect genome-wide SNP marker data for all animals. Moderate heritability estimates of 0.28 and 0.24 were obtained for lice density (as a measure of host resistance) and growth during infestation, respectively. Three QTL explaining between 7 and 13% of the genetic variation in resistance to sea lice (as represented by the traits of lice density) were detected on chromosomes 3, 18, and 21. Characterisation of these QTL regions was undertaken using RNA sequencing and pooled whole genome sequencing data. This resulted in the identification of a shortlist of potential underlying causative genes, and candidate functional mutations for further study. For example, candidates within the chromosome 3 QTL include a putative premature stop mutation in TOB1 (an anti-proliferative transcription factor involved in T cell regulation) and an uncharacterized protein which showed significant differential allelic expression (implying the existence of a cis-acting regulatory mutation). While host resistance to sea lice is polygenic in nature, the results of this study highlight significant QTL regions together explaining between 7 and 13 % of the heritability of the trait. Future investigation of these QTL may enable improved knowledge of the functional mechanisms of host resistance to sea lice, and incorporation of functional variants to improve genomic selection accuracy.


INTRODUCTION
Sea lice are a major concern for salmon aquaculture worldwide, in particular Lepeophtheirus salmonis in the Northern Hemisphere and Caligus rogercresseyi in the Southern Hemisphere. These copepods attach to the skin and feed on the mucus and blood of several species of salmonid fish. Parasitized fish display reduced growth rate and increased occurrence of secondary infections (Jónsdóttir et al., 1992). In addition to a significant negative impact on salmonid health and welfare, lice prevention and treatment costs are a large economic burden for salmonid aquaculture, with global losses of over $430M per year (Costello, 2009). Current control strategies include, for example, feed supplements, cleaner fish, tailored cage design, or "lice-zapping" lasers (Aaen et al., 2015), but these multifaceted strategies are only partially effective. Expensive and potentially environmentally damaging chemicals and treatments are still frequently required to control sea lice populations, which are becoming resistant to common delousing drugs (Bravo et al., 2008;Aaen et al., 2015). Therefore, despite these extensive control efforts, sea lice remain a significant threat to salmon welfare and aquaculture sustainability, and incur in further indirect costs via negative impact on public opinion of aquaculture (Jackson et al., 2017).
Selective breeding can contribute to sea lice prevention via harnessing naturally occurring genetic variation within commercial salmon stocks to identify the most resistant individuals. The identification of selection candidates can be enabled either by pedigree or genomic based approaches, the latter via genomic selection (Meuwissen et al., 2001). Moderate genetic variation in resistance to sea lice exists in Atlantic salmon, with heritabilities typically ranging between 0.1 and 0.3 for both the North Atlantic sea louse (Lepeophtheirus salmonis; Kolstad et al., 2005;Gjerde et al., 2011;Ødegård et al., 2014;Gharbi et al., 2015;Tsai et al., 2016), and the Pacific sea louse (Caligus rogercresseyi; Lhorente et al., 2012;Yáñez et al., 2014;Correa et al., 2017b); and tests of genomic selection approaches for resistance to both lice species have shown substantially higher prediction accuracies that "traditional" pedigree-based approaches (Ødegård et al., 2014;Gharbi et al., 2015;Tsai et al., 2016;Correa et al., 2017a).
Genomic selection is now routinely applied in Atlantic salmon breeding programs for the genetic improvement of several traits (Houston, 2017). While it offers notable benefits in terms of selection accuracy, it has limitations such as the significant cost (via the need for high volume genotyping using SNP arrays), and the limited accuracy when the reference and selection candidate populations are not closely related (e.g., Daetwyler et al., 2012;Tsai et al., 2016). Discovering the causative genetic polymorphisms underlying phenotypic variation in complex traits is a fundamental goal of genetic research. Further, identifying these causative variants would also facilitate more effective genomic selection, potentially via cheaper genoyping strategies, increased genetic gain each generation, and improved persistency of prediction accuracy across generations and populations. Approaches that incorporate biological knowledge into selection models have shown an increase in genomic prediction accuracy, specially across distantly related populations (MacLeod et al., 2016). Further, knowledge of causative variants offer future possibilities of harnessing genomic editing approaches, for example simulations have shown that even a small number of edits per generation can rapidly increase the frequency of favorable alleles and expedite genetic gain (Jenko et al., 2015). However, finding causative mutations within QTL regions is very challenging, with few success stories in farm animals to date. QTL regions tend to cover large segments of chromosomes, and typically contain many variants in linkage disequilibrium that show approximately equal association with the trait. Functional genome annotation data can be applied to prioritize variants within these regions, markedly reducing the number of mutations to investigate by narrowing them to known transcribed and regulatory regions, and -although largely lacking for aquaculture species to date -are currently being developed for Atlantic salmon and other salmonid species (Macqueen et al., 2017). Further understanding of the function of genetic variants, for example through expression QTL studies, can also identify causal associations between genotype and phenotype (Lappalainen, 2015), as previously shown for example for bovine milk composition (Littlejohn et al., 2016) or adiposity in chicken and mice (Roux et al., 2015). While challenging, shortlisting and identification of causative genes and variants impacting on disease resistance in salmon would have positive implications for both selective breeding and fundamental understanding of the host-pathogen interaction.
In the current study, a large population of farmed Atlantic salmon of Chilean origin was challenged with sea lice and high density SNP genotype data was collected. The overall aim of the study was to detect and annotate QTL affecting host resistance to sea lice in farmed Atlantic salmon, with a view to identifying putative causative genes and variants. The specific objectives were a) estimate genetic variance of sea lice resistance in our population, b) dissect the genetic architecture of the trait, and c) explore the genomic basis of the detected QTL using transcriptomics and whole-genome sequencing data.

Disease Challenge
2,668 Atlantic salmon (Salmo salar) Passive Integrated Transponder (PIT)-tagged post-smolts (average weight 122 ± 40 grams, measured for weight and length prior to the start of the challenge) from 104 families from the breeding population of AquaInnovo (Salmones Chaicas, Xth Region, Chile), were experimentally challenged with Caligus rogercresseyi (chalimus II-III). Animals were distributed in three tanks, with 24 to 27 fish of each family in every tank. Infestation with the parasite was carried out by depositing 50 copepods per fish in the tank and stopping the water flow for 6 h during infestation, thereafter water flow was gradually restored reaching its normal flow two days after. During this process oxygen saturation was maintained at 90-110 %, and oxygen and temperature where constantly monitored. Eight days after the infestation fish were sedated before being carefully and individually removed from the tanks. The number of sea lice attached to the fish were counted from head to tail. At this stage, each fish was also measured for weight and length, PIT-tags were read, and fin-clips collected for DNA extraction. Log-transformed lice density was estimated as log e LC BW ini 2/3 , where LC is the number of lice counted on the fish, BW ini is the initial body weight prior to the challenge, and BW ini 2/3 is an approximation of the surface of the skin of each fish (Ødegård et al., 2014). Both raw sea lice counts and sea lice density (lice counts normalized for fish size) were used as proxy of fish resistance to sea lice. Growth during infestation was calculated as [(BW end -BW ini )/BW ini ) * 100], where BW ini and BW end are the weight of the fish at the start and at the end of the trial, respectively, the same formulae was used for length.

Genotyping
DNA was extracted from fin-clips from challenged fish using a commercial kit (Wizard R Genomic DNA Purification Kit, Promega), following the manufacturer's instructions. All samples where genotyped with a panel of 968 SNPs (Supplementary Table  S1) chosen as a subset of the SNPs from a medium density SNP array (Yáñez et al., 2016) using Kompetitive Allele Specific PCR (KASP) assays (LGC Ltd., Teddington United Kingdom). Different low density panels were tested in fish of the same population previously genotyped with the medium density SNP array (Yáñez et al., 2016), and the position of the low density panel markers was selected to maximize imputation accuracy, enriching the extremes of the chromosome were linkage disequilibrium decreases (Yoshida et al., 2018). A population containing full-siblings of the challenged animals had previously been genotyped with a SNP panel of 45,819 SNPs (n = 1,056, Correa et al., 2015;Yáñez et al., 2016), and the experimental lice-challenged population was imputed to ∼46 K SNPs using FImpute v.2.2 (Sargolzaei et al., 2014). Imputation accuracy was estimated by 10-fold cross validation, masking 10% of the 1,056 genotyped full-sibs to the 968 SNP panel, performing imputation, and then assessing the correlation between the true genotypes and the imputed genotypes. All imputed SNPs showing imputation accuracy below 80% were discarded, and the remaining imputed SNPs had a mean imputation accuracy of 95%. Genotypes were further filtered and removed according to the following criteria: SNP call-rate < 0.9, individual call-rate < 0.9, FDR corrected p-value for high individual heterozygosity < 0.05 (removing fish with an excess of heterozygote genotypes), identity-bystate > 0.95 (both individuals removed), Hardy-Weinberg equilibrium p-value < 10 −6 , minor allele frequency < 0.01. After filtering 38,028 markers and 2,345 fish remained for association analysis.

Estimation of Genetic Parameters
Variance components and heritabilities were estimated by ASReml 4.1 (Gilmour et al., 2014) fitting the following linear mixed model: where y is a vector of observed phenotypes (lice number, lice density, initial weight, initial length and weight and length gain during infestation), µ is the overall mean of phenotype records, b is the vector of fixed effects of tank (as factor with 3 levels) and initial body weight (as a covariate; except when initial weight or initial length were the observed phenotypes), a is a vector of random additive genetic effects of the animal, distributed as ∼N (0,Gσ 2 a) where σ 2 a is the additive (genetic) variance, G is the genomic relationship matrix. X and Z are the corresponding incidence matrices for fixed and additive effects, respectively, and e is a vector of residuals. The best model was determined comparing the log-likelihood of models with different fixed effects and covariates. Phenotypic sex was not significant for any of the traits. G was calculated using the GenABEL R package  to obtain the kinship matrix using the method of Amin et al. (2007), which was multiplied by a factor of 2 and inverted using a standard R function. Genetic correlations were estimated using bivariate analysis implemented in ASReml 4.1 (Gilmour et al., 2014) fitting the same fixed and random effects described in the univariate linear mixed model described above.

Single-SNP Genome-Wide Association Study
The single-SNP GWAS was performed using the GenABEL R package  by applying the mmscore function (Chen and Abecasis, 2007), which accounts for the relatedness between individuals applied through the genomic kinship matrix. Significance thresholds were calculated using a Bonferroni correction where genome-wide significance was defined as 0.05 divided by number of SNPs (Duggal et al., 2008) and suggestive as one false positive per genome scan (1/number SNPs). The proportion of variance explained by significant SNPs was calculated as Chi-square / (N-2 + Chi-square), where N is the sample size and Chi-square is the result of the Chi-square test of the mmscore function (Chen and Abecasis, 2007).

Regional Heritability Mapping
A regional heritability mapping (RHM) analysis (Nagamine et al., 2012;Uemoto et al., 2013) was performed where the genome was divided into overlapping regions of 50 consecutive SNPs (according to the most recent Atlantic salmon reference genome assembly GCA_000233375.4) and a step-size of 25 SNPs (two consecutive SNP windows share 25 SNPs) using Dissect v.1.12.0 (Canela-Xandri et al., 2015). The significance of the regional heritability for each window was evaluated using a log likelihood ratio test statistic (LRT) comparing the global model fitting all markers with the model only fitting SNPs in a specific genomic region.

Whole-Genome Sequencing
Genomic DNA from a pool of 50 fish with high sea lice counts (Mean = 66.8 ± 21.5) and a pool of 50 fish with low sea lice counts (Mean = 20.4 ± 5.3) were sequenced in five lanes of HiSeq 4000 as 150 bp PE reads. Family structure was similar in both pools, with 34 different families and a maximum of two fish per family. The quality of the sequencing output was assessed using FastQC v.0.11.5 1 . Quality filtering and removal of residual adaptor sequences was conducted on read pairs using Trimmomatic v.0.32 (Bolger et al., 2004). Specifically, Illumina adaptors were clipped from the reads, leading and trailing bases with a Phred score less than 20 were removed, and the read trimmed if the sliding window average Phred score over four bases was less than 20. Only reads where both pairs were longer than 36 bp postfiltering were retained. Filtered reads were mapped to the most recent Atlantic salmon genome assembly (ICSASG_v2; Genbank accession GCF_000233375.1; Lien et al., 2016) using Burrows-Wheeler aligner v.0.7.8 BWA-MEM algorithm (Li, 2013). Pileup files describing the base-pair information at each genomic position were generated from the alignment files using the mpileup function of samtools v1.4 (Li et al., 2009), discarding those aligned reads with a mapping quality < 30 and those bases with a Phred score < 30. Synchronized files containing read counts for every allele variant in every position of the genome were obtained using Popoolation2 v1.201 (Kofler et al., 2011). A read depth ≥ 10 and a minimum of three reads of the minor allele were required for SNP calling.

Differential Allelic Expression
Since complex traits can be influenced by causative mutations in regulatory regions that impact gene expression (Keane et al., 2011;Albert and Kruglyak, 2015), the sequence data from a previous RNA-Seq study on the skin of a subset of animals taken from this sea lice infected population (Robledo et al., 2018a) were used to investigate allelic specific expression (differential abundance of the allelic copies of a transcript) to reveal putative nearby cis-acting regulatory polymorphisms. Alignment files were produced using STAR v.2.5.2b (Dobin et al., 2013; detailed protocol can be found in Robledo et al., 2018a) and used for SNP identification and genotype calling with samtools v1.4 (Li et al., 2009). Reads with mapping quality < 30 and bases with phred quality scores < 30 were excluded. A read depth ≥ 10 and ≥ 3 reads for the alternative allele were required to call a SNP. Read counts were obtained for each allele in heterozygous loci and a binomial test was performed to assess the significance of the allelic differences using the R package AllelicImbalance (Gådin et al., 2015).

Disease Challenge, Genotyping and Imputation
Eight days after the start of the challenge, the average lice burden per fish across the challenged population was 38 ± 16. The average weight prior to the start of the trial was 122 ± 40 grams and 143 ± 49 grams after the challenge. All samples were genotyped using a low-density SNP panel (968 SNPs). The average r 2 (squared correlation coefficient between alleles at different loci, Hill and Robertson, 1968) between consecutive markers in the low density panel was 0.06, and the average physical distance 438 Kb. 50 samples were genotyped for less than 90% of the SNPs and therefore removed from subsequent analyses. The remaining samples were imputed to high-density from a population of 1,056 salmon that included full siblings of the challenge population, which had previously been genotyped for 45K SNPs (subset of Yáñez et al., 2016 selected as described in Correa et al., 2015). After removing SNPs showing low imputation accuracy (< 80 %), a total of 39,416 SNPs remained with an average imputation accuracy (as assessed by cross-validation) of ∼95 %. After further call rate, minimum allele frequency, heterozygosity, identity-by-descent and Hardy-Weinberg filters, 38,028 markers and 2,345 fish remained for downstream analyses.

Genetic Parameters
Heritabilities and genetic correlations of different traits related to sea lice load, growth and growth during infestation are shown in Table 1. The estimated heritability for sea lice load was 0.29 ± 0.04, and the number of sea lice attached to each fish showed positive genetic correlation with both initial weight (0.47 ± 0.07) and initial length (0.42 ± 0.08). However, sea lice density (h 2 = 0.28 ± 0.04) was independent of the size of the fish which implies that these positive genetic correlations are due to the fact that larger fish tend to have more lice. Initial weight and length showed significant heritabilities as expected, and growth during infestation also presented a moderate heritability, especially weight (h 2 = 0.24 ± 0.04). Surprisingly, weight gain during infestation showed a positive genetic correlation with sea lice density and sea lice counts, albeit with a high standard error (0.25 ± 0.12 and 0.27 ± 0.12, respectively).

Genome-Wide Association
The genetic architectures for the traits of sea lice density (Figure 1) and growth gain during infestation (Supplementary Figure S1) were studied using two different methods. The single SNP GWAS for sea lice density revealed three (imputed) SNPs reaching genome-wide significance in the distal part of chromosome 3 (Figure 1A), each estimated to explain 3.61-4.14 % of the genetic variation. Additional SNPs showed suggestive association with sea lice density in chromosome 5, chromosome 9, and chromosome 18. Regional heritability analyses using 50 SNP windows confirmed the QTL in chromosomes 3 and 18, both estimated to explain ∼7.5 % of the genetic variation in sea lice density ( Figure 1B). The RHM approach detected an additional QTL not found in the single SNP GWAS, on chromosome 21, explaining close to 10 % of the genetic variation in sea lice density. The QTL regions in chromosomes 3, 18, and 21 were further refined, adding and removing SNPs until the window explaining the most genetic variation for each QTL was found. These QTL were narrowed to regions of 3-5 Mb and were each estimated to explain between 7.8 and 13.4 % of the genetic variation in sea lice density, accounting in total for almost 30% of the genetic variance ( Table 2) assuming additive effects of the QTL.

QTL Characterization
The three sea lice density QTL regions were then further interrogated to identify and characterize potential causative genes and variants. The Atlantic salmon genome annotation (Lien et al., 2016), the results of a previous RNA-Seq study comparing liceattachment sites and healthy skin (Robledo et al., 2018a), and SNP variants obtained from the WGS of pools of fish with high Frontiers in Genetics | www.frontiersin.org Heritabilities are in bold and genetic correlations are in normal font. and low number of lice were combined to obtain a holistic view of these regions (Figure 2 and Supplementary Figures S1, S2).
The QTL regions were all relatively large, and contained a large number of SNPs and genes. The whole-genome re-sequencing data led to the identification of 16K-22K putative SNPs in each of the QTL regions, but less than a thousand were located in genic regions in each of them. Surprisingly, the number of mutations that were predicted to have a moderate or large functional effect was relatively high, especially in chromosome 3 where a total of 213 non-synonymous SNPs were detected, along with 5 premature stop, 1 stop lost and 12 start gain mutations. The equivalent numbers were more modest for chromosomes 18 and 21, with 37 and 13 non-synonymous mutations, respectively, but still relatively high to single out high priority candidate causative variants using variant effect prediction data alone. The three QTL regions had also a relatively high number of genes  (83, 36, and 11 genes in the QTL regions of chromosomes 3, 18 and 21, respectively). Therefore, to shortlist candidate genes and variants, a combination of differential expression between resistant and susceptible fish, variant effect prediction, and a literature search relating to the function of the genes and their potential role in host response to parasitic infection were used. The clearest candidate gene in chromosome 3 was arguably TOB1, where a premature stop mutation was detected. This transcription factor negatively regulates cell proliferation (Matsuda et al., 1996), including that of T-cells (Baranzini, 2014). In our study, it was highly expressed in the skin according to RNA-Seq data, and its expression was significantly lower in lice attachment regions of the skin (Robledo et al., 2018a). For chromosome 21, serine / threnonine-protein kinase 17B (STK17B) showed the highest fold change between liceattachment and healthy skin and a missense mutation; this gene has been connected to apoptosis and T-cell regulation, and T-cells of STK17B deficient mice are hypersensitive to stimulation (Honey, 2005). Previous studies comparing the immune response of resistant and susceptible salmonid species have linked Th2type responses to sea lice resistance (Braden et al., 2015), which is consistent with these two genes potentially having a functional role relating to the resistance QTL. Chromosome 18 does not contain any clear candidate genes, but from a literature search alone, the most plausible gene is probably heme binding protein 2 (HEBP2). Reducing iron availability has been suggested as a possible mechanism of resistance to sea lice (Fast et al., 2007;Sutherland et al., 2014) and Piscirickettsia salmonis (Pulgar et al., 2015) in Atlantic salmon.
A differential allelic expression analysis was performed to screen for potential cis-acting regulatory mutations affecting genes in the QTL regions. An uncharacterized gene (XP_014049605.1) showed clear signs of differential allelic expression (P = 0.00081, Figure 3) in chromosome 3 at 8.1 Mb, less than 200 Kb away from the significant GWAS SNPs. This gene is also highly expressed in the skin of lice-infected salmon (Robledo et al., 2018a), and similar proteins are found in other salmonid and teleost species (NCBI's non-redudant protein database).

DISCUSSION
In the current study we aimed to use a combination of GWAS, RNA-Seq, whole genome resequencing, and functional annotation approaches to identify and characterize QTL influencing host resistance to sea lice. Heritability estimates for sea lice density were similar or higher than previous studies on C. rogercresseyi-challenged Atlantic salmon. Lhorente et al. (2012Lhorente et al. ( , 2014 obtained pedigree-estimated heritabilities of 0.17-0.34, while estimates on a previously related sea lice challenged population were of 0.10-0.11 with both pedigree and molecular information (Yáñez et al., 2014;Correa et al., 2017a,b). This heritability is also in consistent with heritability estimates for salmon challenged with L. salmonis (0.2-0.3; Kolstad et al., 2005;Gjerde et al., 2011;Gharbi et al., 2015;Tsai et al., 2016), and similar to heritabilities for resistance to other ectoparasites affecting Atlantic salmon such as Gyrodactylus salaris (0.32; Salte et al., 2010) and Neoparamoeba perurans (Amoebic gill disease; 0.23-0.48; Taylor et al., 2009;Robledo et al., 2018a).
Previous studies on the architecture of resistance to C. rogercresseyi had revealed just one significant SNP in chromosome 21 ∼6.5 Mb (Correa et al., 2017b). While no significant SNPs were found in chromosome 21 in our study using single SNP GWAS, the regional heritability analysis did highlight the nearby region between 8 and 10.5 Mb as explaining over 13% of the genetic variation of the trait. Our RHM analysis also identified two additional QTL explaining a significant amount of the genetic variance, but only one of them detected in our single SNP GWAS. RHM analyses and other similar approaches use the information of several consecutive SNPs, increasing the statistical power and reliability of association mapping (Riggio and Pong-Wong, 2014;Shirali et al., 2016), which consequently should result in higher repeatability and concordance between genetic association studies. Accordingly, in our study the RHM analysis arguably located the previously detected QTL (Correa et al., 2017b), while the single SNP GWAS seemingly failed to do so.
Discovering the causal variants underlying QTL is a very challenging task, and as result very few causative variants have been identified to date. The first problem lies with the large regions that have to be investigated, since narrowing the QTL is extremely difficult due to reduced recombination and high linkage disequilibrium along large regions of the genome (e.g., Tsai et al., 2016). Further, despite the simplicity of identifying most or all variants within a region using WGS, prioritizing those variants is challenging with the current status of annotation of the Atlantic salmon genome, which has 48,775 protein coding genes and 97,546 mRNAs (Lien et al., 2016). Putative non-synonymous and even premature stop mutations appear relatively frequently, probably indicating a significant proportion of pseudogenes, and therefore hindering our ability to prioritize functional mutations. Further, in complex traits, a high proportion of causative mutations are located in regulatory elements (Keane et al., 2011;Albert and Kruglyak, 2015), which are difficult to evaluate without comprehensive genome annotation using assays that can identify such regions. In this sense, the outputs of the Functional Annotation of All Salmonid Genomes (FAASG; Macqueen et al., 2017) initiative should contribute to prioritization of intergenic SNPs through the characterization of functional regulatory elements in salmonid genomes. Complementary, differential allelic expression (DAE) and expression QTL (eQTL) can be an effective route to identify functional candidates (Gamazon et al., 2018). The caveat of DAE and eQTL is that gene expression is quite commonly restricted to specific tissues. A GTex-like project (GTEx Consortium, 2017) in salmonids could also facilitate the discovery of functional variants underlying QTL.
Despite these limitations, two genes were identified that are strong candidates for the QTLs in chromosome 3 and 21, TOB1 and STK17B, respectively. Coho salmon, a salmonid species considered resistant to sea lice, shows pronounced epithelial hyperplasia and cellular infiltration two days after sea lice attachment, and wound healing combined with a strong Th2 immune response has been suggested as the mechanism of resistance (Braden et al., 2015). TOB1 and STK17B have been previously associated with cell proliferation and T cell regulation. TOB1 is an antiproliferative protein which is ubiquitously expressed in several species (Baranzini, 2014), and inhibits T cell proliferation in humans (Tzachanis et al., 2001). TOB1 downregulation in response to sea lice attachment suggests that this gene plays a relevant role in the Atlantic salmon response to the parasite, and the detected putative premature stop codon mutation may be concordant with faster wound healing and T cell proliferation. STK17B, also known as DRAK2, has also been connected to T cell function (Honey, 2005;Gatzka et al., 2009) and to proliferation in cancer (Yang et al., 2012;Lan et al., 2018). STK17B contains a non-synonymous mutation and marked up-regulation in response to sea lice in salmon skin. In addition to these two strong functional candidates, the allelic differential expression analysis also revealed an uncharacterized protein regulated by cis-polymorphisms in the QTL region in chromosome 3. These three genes and their mutations deserve further attention in follow-up studies aimed to increase resistance to sea lice in Atlantic salmon. Such follow up studies could include further functional annotation of QTL regions using chromatin accessibility assays to identify genomic regions potentially impacting on the binding of transcription factors or enhancers. Genome editing approaches (such as CRISPR-Cas9) could be applied to test hypotheses relating to modification of gene function or expression caused by coding or putative cis-acting regulatory variants in cell culture, or ultimately to perform targeted perturbation of the QTL regions and assess the consequences on host resistance to sea lice in vivo.

CONCLUSION
Host resistance to sea lice in this Chilean commercial population is moderately heritable (h 2 = 0.28) and shows a polygenic architecture, albeit with at least three QTL of moderate effect on chromosomes 3, 18, and 21 (7.8 to 13.4% of the genetic variation). Growth during infestation also has a significant genetic component (h 2 = 0.24), and its genetic architecture is clearly polygenic, with QTL of small effect distributed along many genomic regions. The three QTL affecting lice density were further investigated by integrating RNA-Seq and WGS data, together with a literature search. A putative premature stop codon within TOB1, an anti-proliferative protein, seems a plausible candidate to explain the QTL in chromosome 3. Alternatively, an uncharacterized protein on the same QTL region displayed differential allelic expression, and which may form a suitable target for further functional studies. STK17B, functionally connected to proliferation and T cell function, is a plausible candidate for the QTL in chromosome 21. It is evident that even when all variants in a QTL region are discovered, that shortlisting and prioritizing the potential causative variants underlying QTL is challenging. However, the impending availability of more complete functional genome annotation and eQTL data is likely to assist this process, thereby helping to elucidate the functional genetic basis of complex traits in aquaculture species.

DATA AVAILABILITY STATEMENT
The RNA-Seq raw reads have been deposited in NCBI's Sequence Read Archive (SRA) under Accession No. SRP100978, and the results have been published in Robledo et al. (2018b)

ETHICS STATEMENT
The lice challenge experiments were performed under local and national regulatory systems and were approved by the Animal Bioethics Committee of the Faculty of Veterinary and Animal Sciences of the University of Chile (Santiago, Chile), Certificate No 01-2016, which based its decision on the Council for International Organizations of Medical Sciences (CIOMS) standards, in accordance with the Chilean standard NCh-324-2011.

AUTHOR CONTRIBUTIONS
RH, JY, and DR were responsible for the concept and design of this work and drafted the manuscript. JP was responsible for the disease challenge experiment. AB managed the collection of the samples. AG performed the molecular biology experiments. DR performed bioinformatic and statistical analyses. All authors read and approved the final manuscript.

ACKNOWLEDGMENTS
We would like to thank the contribution of Benchmark Genetics Chile and Salmones Chaicas for providing the biological material and phenotypic records of the experimental challenges. We would also like to thank Benchmark Genetics Chile for providing high-density genotypes of full-sibs of our experimental population for imputation. JY would like to acknowledge the support from Núcleo Milenio INVASAL from Iniciativa Científica Milenio (Ministerio de Economía, Fomento y Turismo, Gobierno de Chile).

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2019. 00056/full#supplementary-material FIGURE S1 | QTL region in chromosome 3. Bars represent the log 2 fold change between healthy and sea lice attachment skin for every gene in the QTL region according to the RNA-seq. Bar colour represents the expression level of the gene (lighter = less expressed), and the annotation of the gene is presented in a label on the top of the graph. Genic SNPs detected by WGS are shown in between, those with putatively more severe effects are located towards the top of the figure.
FIGURE S2 | QTL region in chromosome 18. Bars represent the log2 fold change between healthy and sea lice attachment skin for every gene in the QTL region according to the RNA-seq. Bar colour represents the expression level of the gene (lighter = less expressed), and the annotation of the gene is presented in a label on the top of the graph. Genic SNPs detected by WGS are shown in between, those with putatively more severe effects are located towards the top of the figure.
TABLE S1 | Markers in the low density SNP panel.