Genome-Wide Association Study for Identifying Loci that Affect Fillet Yield, Carcass, and Body Weight Traits in Rainbow Trout (Oncorhynchus mykiss)

Fillet yield (FY, %) is an economically-important trait in rainbow trout aquaculture that affects production efficiency. Despite that, FY has received little attention in breeding programs because it is difficult to measure on a large number of fish and cannot be directly measured on breeding candidates. The recent development of a high-density SNP array for rainbow trout has provided the needed tool for studying the underlying genetic architecture of this trait. A genome-wide association study (GWAS) was conducted for FY, body weight at 10 (BW10) and 13 (BW13) months post-hatching, head-off carcass weight (CAR), and fillet weight (FW) in a pedigreed rainbow trout population selectively bred for improved growth performance. The GWAS analysis was performed using the weighted single-step GBLUP method (wssGWAS). Phenotypic records of 1447 fish (1.5 kg at harvest) from 299 full-sib families in three successive generations, of which 875 fish from 196 full-sib families were genotyped, were used in the GWAS analysis. A total of 38,107 polymorphic SNPs were analyzed in a univariate model with hatch year and harvest group as fixed effects, harvest weight as a continuous covariate, and animal and common environment as random effects. A new linkage map was developed to create windows of 20 adjacent SNPs for use in the GWAS. The two windows with largest effect for FY and FW were located on chromosome Omy9 and explained only 1.0–1.5% of genetic variance, thus suggesting a polygenic architecture affected by multiple loci with small effects in this population. One window on Omy5 explained 1.4 and 1.0% of the genetic variance for BW10 and BW13, respectively. Three windows located on Omy27, Omy17, and Omy9 (same window detected for FY) explained 1.7, 1.7, and 1.0%, respectively, of genetic variance for CAR. Among the detected 100 SNPs, 55% were located directly in genes (intron and exons). Nucleotide sequences of intragenic SNPs were blasted to the Mus musculus genome to create a putative gene network. The network suggests that differences in the ability to maintain a proliferative and renewable population of myogenic precursor cells may affect variation in growth and fillet yield in rainbow trout.

Fillet yield (FY, %) is an economically-important trait in rainbow trout aquaculture that affects production efficiency. Despite that, FY has received little attention in breeding programs because it is difficult to measure on a large number of fish and cannot be directly measured on breeding candidates. The recent development of a high-density SNP array for rainbow trout has provided the needed tool for studying the underlying genetic architecture of this trait. A genome-wide association study (GWAS) was conducted for FY, body weight at 10 (BW10) and 13 (BW13) months post-hatching, head-off carcass weight (CAR), and fillet weight (FW) in a pedigreed rainbow trout population selectively bred for improved growth performance. The GWAS analysis was performed using the weighted single-step GBLUP method (wssGWAS). Phenotypic records of 1447 fish (1.5 kg at harvest) from 299 full-sib families in three successive generations, of which 875 fish from 196 full-sib families were genotyped, were used in the GWAS analysis. A total of 38,107 polymorphic SNPs were analyzed in a univariate model with hatch year and harvest group as fixed effects, harvest weight as a continuous covariate, and animal and common environment as random effects. A new linkage map was developed to create windows of 20 adjacent SNPs for use in the GWAS. The two windows with largest effect for FY and FW were located on chromosome Omy9 and explained only 1.0-1.5% of genetic variance, thus suggesting a polygenic architecture affected by multiple loci with small effects in this population. One window on Omy5 explained 1.4 and 1.0% of the genetic variance for BW10 and BW13, respectively. Three windows located on Omy27, Omy17, and Omy9 (same window detected for FY) explained 1.7, 1.7, and 1.0%, respectively, of genetic variance for CAR. Among the detected 100 SNPs, 55% were located directly in genes (intron and exons). Nucleotide

INTRODUCTION
Fillet yield (FY, %), the ratio between the separable muscle of the fish and the harvest weight, is an economically important trait in rainbow trout aquaculture that reflects production efficiency with a consequent large impact on the returns (Kause et al., 2002). On a per-unit basis, the fillet price is three times that of the whole fish (Bugeon et al., 2010) and a recent global survey ranked FY among the six most important traits for genetic improvement in rainbow trout aquaculture (Sae-Lim et al., 2012). Despite that, FY has not received much attention in breeding programs because it is a lethally-measured trait, costly to measure, and difficult to select on, thus limiting the genetic progress in traditional selective breeding programs (Kause et al., 2007).
To overcome these phenotyping constraints and to increase the accuracy of selection, the use of high-density single nucleotide polymorphism (SNP) genotyping platforms and novel statistical models can identify SNPs associated with the trait for use in genomic selection (Dekkers, 2012). Several genome-wide association studies (GWAS) that match genetic variants, with or without pedigree records, with the observed phenotype have identified significant loci associated with economically important traits in livestock (Kadarmideen, 2014;Sharma et al., 2015). The aquaculture research community is following this trend, and interesting examples are available for growth and disease resistance traits in Atlantic Salmon (e.g., Correa et al., 2015;Gutierrez et al., 2015), rainbow trout (e.g., Kocmarek et al., 2015;Liu et al., 2015b;Palti et al., 2015b), and catfish (e.g., Geng et al., 2015).
Single-step genomic best linear unbiased prediction (ssGBLUP) and Bayesian variable selection methods are available for genomic predictions using dense SNP arrays. The ssGBLUP method (Misztal et al., 2009;Christensen and Lund, 2010) jointly incorporates all available genotypic, phenotypic, and pedigree data in the genomic predictions, but the method's assumption of an infinitesimal model is violated for traits affected by major genes. In contrast, Bayesian variable selection methods assume a prior distribution of the variance associated with each locus (Meuwissen et al., 2001), but are limited in that only data from genotyped animals are used in the analyses. Despite the limitations on data, the Bayesian variable selection method has resulted in greater accuracy of predicted genetic merit compared to ssGBLUP for traits known to have QTL segregating with moderate or large effect like bacterial cold water disease resistance in rainbow trout (Vallejo et al., submitted manuscript). The ssGBLUP method has been extended to a weighted ssGBLUP (wssGBLUP) that emulates the Bayesian variable selection method by allowing for unequal variances across loci (Wang et al., 2012).
In rainbow trout, the recent development of the 57K SNP Axiom R Trout Genotyping Array ; Affymetrix, Santa Clara, CA), a new genetic linkage map (published here) developed using 47,939 markers included in the 57K SNP Axiom R Trout Genotyping Array, and the release of a reference genome (Berthelot et al., 2014) have provided tools for studying FY using GWAS. The objectives of this study were to conduct a GWAS for FY and other carcass and body weight traits in a rainbow trout population developed at the National Center for Cool and Cold Water Aquaculture (NCCCWA) and selectively bred for improved growth performance, and to visualize gene networks and functional categories enriched among the putative genes detected by GWAS for the studied traits.

Ethics Statement
The National Center for Cool and Cold Water Aquaculture (NCCCWA) Institutional Animal Care and Use Committee (Leetown, WV) reviewed and approved all experimental procedures used in this study (Protocol #056).

NCCCWA Population
The development, husbandry practices, and selective breeding of the resource population used in this study have been previously described (Leeds et al., 2016). Briefly, the population was initially developed by intercrossing 7 domesticated strains of rainbow trout. The population was closed to outside germplasm in 2004, and has since been selectively bred each generation for improved growth performance using an index of 10-month body weight and thermal growth coefficient for the growth period between 10 and 13 months of age. Full-sib families characterized for harvest traits in the current study were from the third (2010 year class; 98 families), fourth (2012 year class; 99 families), and fifth (2014 year class; 102 families) generations of selection. The population was converted to all-female families beginning in 2010 by using masculinized females as sires, thus the proportion of males was only 11% in 2010, 4% in 2012, and 0.6% in 2014. All data used in the current study were from non-masculinized fish. All families were maintained in separate tanks to retain pedigree information until tagging with a passive integrated transponder (Avid Identifiction Systems Inc., Norco, CA) at 4-5 months posthatch. An average of 15.6 fish per family was tagged in 2010, 2012, and 2014 (range = 8-17 fish per family). After tagging, fish from all families were commingled in replicate grow-out tanks. All fish were fed a standard commercial diet (Ziegler Bros, Inc., Gardners, PA) throughout the study using automated feeders (Arvotec, Huutokoski, Finland).

Traits
Individual body weight data at 10 months (BW10; 295 d ± 18.8) and 13 months (BW13; 386 d ± 12.9) post-hatch were recorded as part of the selective breeding program (Leeds et al., 2016) and were available from 2002 to 2014 (Table 1).
Five fish from each 2010, 2012, and 2014 year class full-sib family were identified for characterization of FY. The aim was to sample fish that represented the range of body weights within each family, with the exception that the largest and smallest fish from each family were excluded. Most families had 15 fish available for sampling at 13 months of age (range = 8-17 fish per family). Thus, to identify fish for FY characterization, the dataset was sorted by family and in descending order of body weight and every 2nd or 3rd fish was selected so that the distribution of body weights was uniformly centered around the median of the family. Selected fish were assigned to one of five harvest groups each generation (∼100 fish per harvest group) with the aim of having one fish per family per harvest group. One harvest group per week was processed in each of five consecutive weeks, with the exception that year class 2012 fish were harvested over a 6-week period due to scheduling conflicts. Fish were taken off feed 5 days before harvesting. Fish were harvested between 410 and 437 days post-hatch (mean body weight = 985 g; SD = 239 g), between 446 and 481 days post-hatch (mean body weight = 1803 g; SD = 305 g), and between 407 and 435 days post-hatch (mean body weight = 1617 g; SD = 255 g) for the 2010, 2012, and 2014 hatch years, respectively. At harvest, fish were euthanized using a lethal dose of tricaine methanesulfonate (Tricaine-S, Western Chemical, Ferndale, WA), weighed, eviscerated, and placed on ice overnight. The next day, carcasses were beheaded, weighed, and hand-filleted by a single, experienced technician. The same technician filleted all fish from the 2010 and 2012 year class families, and a different technician filleted all fish from the 2014 year class families. Fillet weight was recorded as the sum of both fillets for each fish; fillet weights excluded the skin in 2010 and 2012 year class families but included skin in 2014 year class families. A summary of the records available, mean, standard deviation and coefficient of variation for each trait is presented in Table 1.

Genetic Linkage Map
As the current rainbow trout refrence genome (Berthelot et al., 2014) is fragmented into sequence scaffolds and true chromosome sequences are not yet available as a reference for genetic analyses like GWAS, we generated a new dense linkage map which was used as a genetic map reference in this study ( Table S1). The 57K SNP Axiom R Trout Genotyping Array  was used to genotype (GeneSeek, Inc., Lincoln, NE) 2464 samples collected across 46 full-sib families from a commercial Norwegian population and 10 full-sib families from the NCCCWA odd-year breeding population. Following quality control of raw genotype data as previously described , linkage mapping was performed with Lep-MAP software (Rastas et al., 2013). First, SNPs were assigned to linkage groups with the "SeparateChromosomes" command using increasing LOD thresholds until the observed number of linkage groups corresponded with the haploid chromosome number in this species. Additional SNPs were subsequently added to the groups with the "JoinSingles" command at a more relaxed LOD threshold, and finally SNPs were ordered in each linkage group with the "OrderMarkers" command. Numerous iterations were performed to optimize error and recombination parameters. A total of 47,839 SNPs were mapped to 29 linkage groups, with an average of 1650 SNPs per group. The number of SNPs assigned to each group ranged from 754 to 2934. The total distances covered by the male and female maps were 2214 cM and 4248 cM, respectively. In all 13 chromosomes, known to have homologous pairing with at least one other chromosome arm, female/male recombination ratios were >2.0; whereas, in non-duplicated chromosomes, the female/male recombination ratio ranged from 1.0 to 2.0, with the exception of chromosomes Omy15 and Omy21.

Genotyping for FY GWAS
The 57K SNP Axiom R Trout Genotyping Array  was used to genotype 941 fish with FY phenotypes from the 2010 and 2012 year classes (197 full-sib families) and 392 direct ancestors of these fish back to the grandparents of the 2010 year class families. Quality control (QC) of the genotype data was performed to exclude fish with genotype call rates <0.95. Of the 941 fish with FY phenotypes and genotypes and from the 392 direct ancestors with only genotypes, were retained 875 and 391, respectively (1266 fish) that passed QC. Genotypic data from the 391 direct ancestors were used to confirm accuracy of pedigrees for the 875 phenotyped and genotyped fish used in the GWAS.
From the initial 42,488 SNPs that were polymorphic in this population, we retained those with call rates greater than 0.90, minor allele frequencies greater than 0.05, and no departures from Hardy-Weinberg equilibrium (SNPs were excluded when the difference between observed and expected genotype frequencies was >0.15). A total of 38,107 effective SNPs passed QC filtering and were used in the GWAS.

Genome-Wide Association Study (GWAS)
GWAS was carried out using the weighted single-step GBLUP approach (wssGWAS, Wang et al., 2012). This method combines phenotype, genotype, and pedigree data in a joint analysis and is implemented in the BLUPF90 software (Misztal et al., 2002;Aguilar et al., 2010;Wang et al., 2012). The wssGWAS avoids spurious solutions of SNPs, uses phenotypes from nongenotyped individuals included in the pedigree, and allows multi-trait analyses (Fragomeni et al., 2014;Wang et al., 2014). However, multi-trait analysis was not performed in this study due to convergence issues associated with the balance of the information. Thus, a single trait analysis was performed using the following model: where y is the vector of the phenotypes, b is the vector of fixed effects including hatch year (all traits), harvest group (except BW10 and BW13), and contemporary group (only for BW10 and BW13), a is the vector of additive genetic effects or SNP effects, w is the vector of the common environment effect, and e is the residual error. Two co-variables were included: posthatch age for BW10 and BW13 and harvest weight for CAR, FW, and FY. X, Z 1 , and Z 2 are incidence matrices relating a record to fixed effects in b and random animal and common environment effects in a and w, respectively. The genomic (G) relationship matrix was created according to VanRaden (2008) and combined with the numerator (A) relationship matrix into a realized (H) relationship matrix to estimate additive genetic relationships among all individuals (Aguilar et al., 2010). The number of phenotypic records used in each single trait analysis is given in Table 1, and all analyses were conducted using complete pedigree data (17,706 fish) dating back to initial development of the population (8 generations). A total of 875 fish from 196 fullsib families used in the GWAS analyses had genotypic data from the SNP array and phenotypic data for all traits in Table 1. The ssGWAS2 scenario described by Wang et al. (2014) was used to estimate genomic breeding values and iteratively estimate and weight SNP effects. Windows of 20 adjacent SNPs based on the new genetic linkage map were created using POSTGSF90 . Creation of windows of consecutive SNPs can capture the effect of the quantitative trait loci (QTL) better than a single SNP  due to signal concentration (Sun et al., 2011). Similar to previous reports (Dikmen et al., 2013;Fragomeni et al., 2014;Wang et al., 2014;Zhang et al., 2016), we found that the use of windows with a smaller number of SNPs resulted in a decreased signal-to-noise ratio compared to windows with 20 or more SNPs (data not shown). Therefore, exclusive windows (non-overlapping) of 20 consecutive SNPs were used in the GWAS analyses and the Manhattan plots with GWAS results were created using the R package qqman (Turner, 2014). Due to the quantitative nature of the traits used in this study, five iterations (wssGWAS) were implemented to reduce the background noise of SNP windows and differentiate the windows accounting for the largest proportion of variance. However, the differences in the results beyond 3 iterations were minimal (results not presented). Rather than using Pvalues from classical hypothesis tests to declare regions as significantly associated with the trait (Dikmen et al., 2013), here we identified genomic regions (windows) that explained the highest proportion (around 1%) of genetic variances .

Putative Genes Network
Interpretation of the GWAS results was facilitated using network reconstruction and visualization. A network was created using SNPs in windows that explain the highest proportion of the variance for a particular trait. Selected SNPs were located in the rainbow trout genome (Berthelot et al., 2014) using the Golden Helix Genome Browser v2.1.0 (Golden Helix Inc.). When a SNP was mapped to a gene (exon or intron), the nucleotide sequence was blasted to the Mus musculus genome using NCBI tools to predict the orthologous genes. The mouse genome was selected because it is well annotated, is available in several bioinformatics tools, and has been used as a model organism for many years (Guenet, 2005). With the list of the genes, a network was visualized with Cytoscape (Shannon et al., 2003;Serão et al., 2013) using the BisoGenet plug-in (Martin et al., 2010;Gonzalez-Pena et al., 2016). The BisoGenet plug-in uses empirical and predicted DNA-DNA, DNA-protein, and proteinprotein interactions to reconstruct and visualize likely mouse gene networks. Only gene-gene interactions were represented and only one neighbor connecting the genes was admitted. The edges denote known relationships between genes from several databases summarized in the SysBiomics repository that integrates data from NCBI, UniProt, KEGG, and GO databases using the k-nearest neighbor model. The node color denotes if the gene was identified (green) directly from the GWAS or if it is a connecting neighbor (pink) based on annotation of the mouse genome.
A functional analysis was attempted using the Database for Annotation, Visualization and Integrated Discovery (DAVID 6.7; Huang et al., 2009a,b) with the genes detected by the GWAS. However, enriched gene ontology, biological processes, molecular functions, and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways were not identified with a relevant false discovery rate. We assume that the polygenic nature of the traits was a major contributor to our inability to identify functional candidate genes near the QTL detected in the GWAS. In addition, the overall number of protein coding open reading frames we were able to find near the QTL was relatively small due to the small genome scaffolds and fragmented reference genome that is currently available for rainbow trout (Berthelot et al., 2014), which did not allow for evaluating genes on neighboring genome sequence scaffolds.
The correlations between estimated breeding values (EBVs) and genomic breeding values (GEBVs) were near unity (0.99) for traits measured directly on breeding candidates (BW10 and BW13) and smaller (0.72-0.79) for lethally-measured traits (CAR, FW, and FY; Table 2). In the absence of progeny performance data or a correlated trait that can be measured directly on breeding candidates, traditional BLUP-based EBVs for CAR, FW, and FY are necessarily identical among nonphenotyped siblings, whereas the use of genomic information (GI) in GBLUP enables fish-specific GEBVs despite the absence of phenotypic data for the fish, its progeny, or for a correlated trait. Thus, the smaller correlations between EBVs and GEBVs for lethally-measured traits was expected because the correlation is between family-specific and fish-specific estimates of genetic merit, respectively. When GI was added to the aforementioned analysis, a slight increase in heritabilities was observed for all traits (Table 2). However, the accuracies of GEBVs were increased by ∼100% for CAR and FW (from 0.28 and 0.25 to 0.55 and 0.50, respectively) and by ∼420% for FY (from 0.13 to 0.55) compared traditional pedigree-based EBVs. For lethallymeasured traits like FW, FY, and CAR, traditional selective breeding programs rely on sib-testing with limited reliability (Odegård et al., 2014). Therefore, methods that increase accuracy of predictions and expedite genetic progress by exploiting withinfamily genetic variation for economically-important traits in aquaculture species are important for continued development of the aquaculture industry . Models that include GI from numerous SNP markers in addition to the phenotypic and pedigree information without previous knowledge of the underlying QTL outperformed models without GI (Nielsen et al., 2009;Odegård et al., 2014). This improved performance of GI models is expected based on the definition of the accuracy as a function of heritability and amount of information used (Chen et al., 2011). Examples of traits for which inclusion of GI resulted in an increase in accuracy include growth in broiler chickens ; lice resistance and fillet color in Atlantic salmon (Odegård et al., 2014); and weight and length traits in Atlantic salmon (Tsai et al., 2015).

GWAS
The Manhattan plots from GWAS results at iteration 5 for FY, FW, BW10, BW13, and CAR are shown in Figures 1-5, respectively. In total, 1906 non-overlapping, non-repetitive windows of 20 successive SNPs were used. Of these windows, two windows located on chromosome Omy9 explained 1.5 and 1.0% of the genetic variance for FY. The same windows explained 1.2 and 1.1% of the genetic variance for FW. Only one window, located on Omy5, was responsible for 1.38 and 0.95% of the genetic variance for BW10 and BW13, respectively. Three windows located on Omy27, Omy17, and Omy9 were responsible for 1.7, 1.7, and 1.0%, of the genetic variance in CAR, respectively.
No major QTL was detected for FY, suggesting that this trait has a polygenic architecture affected by multiple loci with small effects in this rainbow trout population. The SNP markers from the two windows that explained at least 1.0% of the proportion of variance for FY and harboring or neighboring genes from the same genome scaffold (Berthelot et al., 2014) are listed in Table 3. An extended list of all the SNPs markers is provided in Table S2. Only 13 of the 40 SNPs identified by the wssGWAS on Omy9 explained a proportion of the genetic variance equal to or greater than 0.1%. Of them, nine were located within a gene (exon or intron) and the other four had known neighboring genes in the same genome scaffold. Four of those 13 SNPs were located in scaffold_516, of which one SNP was mapped to exon 6 of Properdin, another to exon 3 of Transmembrane protein 201-like, and two SNPs were mapped in the intron regions of Calsyntenin-1-like isoform x2 (Table 3). Briefly, Properdin is one of the proteins that participates in the complement system, which also may interfere with fatty acid uptake and esterification   in adipocytes (Gauvreau et al., 2012). Transmembrane protein 201-like is a component of transmembrane actin-associated nuclear lines with a role in centrosome orientation and nuclear movement prior to cell migration (Borrego-Pinto et al., 2012). Calsyntenin-1 is associated with kinesin-1-mediated transport of vesicles and tubulovesicular organelles (Konecna et al., 2006), appears to participate in intracellular transport and endosomal trafficking, and is necessary for the formation of peripheral sensory axons (Ponomareva et al., 2014). However, the biological functions of calsyntenins are not well understood yet (Ortiz-Medina et al., 2015).
The same two windows found in the FY GWAS were also associated with FW. However, from the 40 SNPs markers identified by the wssGWAS, only 12 explained a proportion equal to or greater than 0.1% and of them, nine were in a gene (exon or intron, Table 4). An extended list of all  the SNP markers is provided in Table S3. Similar to FY, no major QTL was detected for FW which supports the polygenic architecture of FW. Three of the 12 SNPs were also located in scaffold_516 within the Calsyntenin-1-like isoform x2, Properdin and Transmembrane protein 201-like genes ( Table 4). Three additional SNPs were located in scaffold_347 in the Srclike-adapter 2; Serine/ threonine-protein kinase sbk1-like; and Phd finger protein 20-like isoform x3 genes. Briefly, Src-likeadapter 2 is an adaptor protein that regulates T and B cell maturation and development, and it is a critical component regulating signal transduction in immune and malignant cells (Sosinowski et al., 2001;Dragone et al., 2006;Kazi et al., 2015). Additionally, differentially-expressed transcripts in response to handling and confinement stress in rainbow trout were mapped to a serine/threonine-protein kinase, SBK1, homologous in zebrafish that participates in signal transduction pathways related to brain development (Chou et al., 2006;Liu et al., 2015a).
Genome regions associated with growth have been detected on most of the 29 chromosomes in Atlantic salmon (Baranski et al., 2010;Gutierrez et al., 2012Gutierrez et al., , 2015Tsai et al., 2015) and in rainbow trout (O'Malley et al., 2003;Perry et al., 2005;Wringe et al., 2010). The heterogeneity in the results makes it hard to compare our results to previous studies. Many factors contribute to this observed heterogeneity between studies including: (1) the highly polygenic architecture of growth and growth-related traits; (2) differences in marker segregation that may be affected by the strain genetic background, and different types and densities of markers used in each study; (3) different algorithms used in the QTL detection analyses; (4) large variation in sample size (Baranski et al., 2010;Wringe et al., 2010;Tsai et al., 2015), and (5) possible false positives. None of the 20 SNPs identified by the wssGWAS for BW10 and BW13 on chromosome Omy5 (Figures 3, 4) were able to surpass the threshold of 0.1%. The complete list of the SNPs is provided in Tables S4, S5 for BW10 and BW13, respectively. Our findings of the polygenic architecture of growth traits in fish is consistent with previous reports in the literature (Devlin et al., 2009;Dai et al., 2015;Tsai et al., 2015), and, congruent with our GWAS results, several markers were associated with weight in a GWAS for Atlantic salmon, but the proportion of variance explained by each marker was less than 0.1% (Tsai et al., 2015).
Lastly, SNP markers that explained more than 0.1% of the proportion of genetic variance for CAR on Omy27, 17, and 9, and harboring or neighboring genes from the same genome scaffold (Berthelot et al., 2014) are listed in Table 5. No major QTL was detected for this trait. An extended list of all the SNP markers Chr, chromosome; VE, percentage of the genetic variance explained by the SNP; Loc, location in the scaffold with three possibilities: intron, exon, or near an exon. Chr, chromosome; VE, percentage of the genetic variance explained by the SNP; Loc, location in the scaffold with three possibilities: intron, exon, or near an exon.
is provided in Table S6. From the 60 SNP markers identified by wssGWAS, only 18 explained a proportion equal to or greater than 0.1%, of which 10 were in a gene (exon or intron). Four of the 18 SNPs were located in scaffold_173; one was in the exon of ubiquitin-conjugating enzyme e2 variant 1, and three were near this gene and near the histone h2b 1 2-like gene. One of the SNPs was mapped to the Calsyntenin-1-like isoform x2 intron region in scaffold_516 that also affected FY and FW.

Putative Genes
Networks of genes offer insight to the molecular relationships among genes. The network was visualized considering genes and neighboring genes located in the same genome scaffold as the SNPs we identified in the wssGWAS in 20-SNP windows responsible for ∼1.0% or more of the total genetic variance for the analyzed traits. Genes included in the network are described in Tables S2-S6. The network includes 115 gene nodes, of which Chr, chromosome; VE, percentage of the genetic variance explained by the SNP; Loc, location in the scaffold with three possibilities: intron, exon, or near an exon.
21 nodes were genes detected by the wssGWAS analysis (green nodes) while the rest were connecting neighbors (pink nodes, Figure 6). In this network, SRY (sex determining region Y)-box 2 (Sox2), Kinase suppressor of ras 1 (Ksr1), Tripartite motifcontaining 33 (Trim33), and Nitric oxide synthase 2 inducible (Nos2) were well-connected gene nodes linking to 29, 11, 7, and 6 other gene nodes, respectively. A number of genes detected by wssGWAS analysis within windows responsible for 1.0% or greater of the trait variance have been identified in mammalian systems as significant for muscle development, particularly with respect to the maintenance of hyperplastic capacity. Rainbow trout exhibit indeterminate growth potential as they have the ability increase both body length and muscle mass throughout adulthood (Johnston et al., 2011). A physiological mechanism unique to indeterminate growers is the ability to maintain a population of myogenic precursor cells with proliferative capacity that then differentiate into myotubes or contribute to nuclear accretion (Mommsen, 2001;Froehlich et al., 2013). Therefore, the continued ability for hyperplasia and hypertrophy in skeletal muscle contributes to the indeterminate growth phenotype (Johnston et al., 2011). In this regard, SNPs affecting genes related to myogenic and cell proliferation mechanisms are expected to affect growth and fillet yield.
One gene detected by wssGWAS analysis that was the most connected node was Sox2. The Sox2 gene plays a critical role in maintenance and proliferation of pluripotent and neural progenitor stem cells (Takahashi and Yamanaka, 2006;Zhang FIGURE 6 | Network of genes and neighboring genes in the same scaffold as the SNPs identified in the wssGWAS in windows responsible for ∼1.0% or more of the total genetic variance of the analyzed traits. Green nodes denote genes and near genes detected by the wssGWAS analysis and pink nodes denote connecting neighbors. Edges denote known relationships between genes in the SysBiomics repository. Framed genes (red square) are discussed in the manuscript. and Cui, 2014) through its interaction with transforming Growth Factor β (TGFb) signaling (Gaarenstroom and Hill, 2014), although little is known about the role of Sox2 in muscle. Albeit, TGFb ligands like myostatin inhibit muscle growth (Lee et al., 2005;Phelps et al., 2013), partially through reductions in myogenic precursor cell proliferation (Garikipati and Rodgers, 2012;Seiliez et al., 2012). Trim33 is another gene detected by wssGWAS analysis that is up-regulated during muscle regeneration in mice and appears to play a role in myoblast proliferation (Mohassel et al., 2015). Therefore, similar to Sox2, Trim33 may also contribute to maintaining a proliferating population of myogenic precursor cells throughout development in the rainbow trout. Trim33 also inhibits Smad4 (Xi et al., 2011), a transcription factor activated by TGFb signaling that inhibits muscle regeneration and maintenance of myogenesis with age (Lee et al., 2005).
A third gene identified by wssGWAS analysis was Fragile X mental retardation gene 1 (Fxr1), an autosomal gene that is highly expressed in muscle (Blonden et al., 2005). Depletion of this gene during early development of the zebrafish leads to cardiomyopathy and muscular dystrophy ( Van't Padje et al., 2009). In cultured muscle cells, depletion of Fxr1 reduced myoblast abundance, suggesting an evolutionarily-conserved role for Fxr1 protein in myogenesis (Davidovic et al., 2013). A fourth gene detected by wssGWAS analysis that may affect myogenic capacity was Ksr1. This gene is a scaffold protein for the Raf/MEK/ERK kinase cascade (Ory et al., 2003). Although, a role of for Ksr1 in muscle has not been demonstrated, activation of the Raf/Mek/ERK kinase cascade promotes proliferation of myogenic cells (Knight and Kothary, 2011). A final gene that may affect myogenesis was Proviral integration site 1 (Pim1), a gene that, when overexpressed in cardiomyocytes, causes increases in regenerative capacity and increases the pool of progenitor cells (Del Re and Sadoshima, 2012), although it is unknown if this gene has significance in skeletal muscle. However, expression of Pim1 has been positively correlated with intramuscular fat content in steers (Sadkowski et al., 2014).
Another pair of genes that were detected by wssGWAS and are known to encode proteins with functional importance to muscle physiology are the cysteine proteases Calpain-1 (Capn1) and Calpain-2 (Capn2). Calpains are calcium activated cysteine proteases regulated by myogenic factors like myoD and myogenin (Dedieu et al., 2003). Calpains have a relevant role in signal transduction (Glading et al., 2001;Sato and Kawashima, 2001), cell-cycle regulation, apoptosis (Atencio et al., 2000;Patel and Lane, 2000), cell spreading and migration (Dourdin et al., 1997;Huttenlocher et al., 1997;Potter et al., 1998), and myogenesis (Barnoy et al., 2000). Calpains are also involved with myofibrillar protein disassembly and degradation, contributing to loss of the Z disk (Busch et al., 1972). Increased calpain activity in muscle occurs during periods of muscle reorganization and restructuring (Dedieu et al., 2004), such as during weight loss and rapid growth (Salem et al., 2005;Johnston et al., 2011;Salmerón et al., 2015), therefore it is feasible that SNPs affecting calpainrelated proteolysis contributes to differences in muscle growth capacity. Calpain-induced protein degradation is also associated with post-mortem proteolysis so genetic variation in the calpain system may result in differences in fillet quality (Koohmaraie, 1992;Delbarre-Ladrat et al., 2006).

CONCLUSIONS
The use of genomic information from the whole-genome association analyses in this study increased the heritability and accuracy of estimated breeding values for FY, suggesting that genomic selection will be suitable for exploiting within-family genetic variation and thus obtaining faster progress in selective breeding for this trait. Only few windows were able to explain more than 1% of the genetic variance of FY, FW, BW10, BW13, and CAR, thus corroborating the polygenic nature of these traits. Network visualization of the putative genes implicated in the GWAS analyses indicated that differences in the ability of the fish to maintain a proliferative and renewable population of myogenic precursor cells might be affecting the observed phenotypic and genetic variance in growth rate and fillet yield in rainbow trout.

AUTHOR CONTRIBUTIONS
YP, TL conceived and planned the study; PK, TL were responsible for phenotyping; GG, YP were responsible for high-density SNP genotyping and quality control of genotypic data; MB, TM, GG, and YP developed the linkage map; DG, RV were responsible for planning and conducting the GWAS analyses; BC, DG developed and interpreted gene networks; and DG drafted the manuscript. All authors read and approved the final manuscript. Ma, Meghan Manor, Johnni-Ann Sims, and Susan Slider (West Virginia University) for phenotyping and sample collection. The use of trade, firm, or corporation names in this publication is for the information and convenience of the reader. Such use does not constitute an official endorsement or approval by the USDA or the ARS of any product or service to the exclusion of others that may be suitable. USDA is an equal opportunity provider and employer.

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fgene. 2016.00203/full#supplementary-material Table S1 | The rainbow trout genetic linkage map.
Table S2 | The 40 SNP markers from the two windows that explained the largest proportion of variance for FY and harboring or neighboring genes from the same genome scaffold (Berthelot et al., 2014). Table S3 | The 40 SNP markers from the two windows that explained the largest proportion of variance for FW and harboring or neighboring genes from the same genome scaffold (Berthelot et al., 2014). Table S4 | The 20 SNP markers from the one window that explained the largest proportion of variance for BW10 and harboring or neighboring genes from the same genome scaffold (Berthelot et al., 2014). Table S5 | The 20 SNP markers from the one window that explained the largest proportion of variance for BW13 and harboring or neighboring genes from the same genome scaffold (Berthelot et al., 2014). Table S6 | The 60 SNP markers from the three windows that explained the largest proportion of variance for CAR and harboring or neighboring genes from the same genome scaffold (Berthelot et al., 2014).