Combined QTL and Genome Scan Analyses With the Help of 2b-RAD Identify Growth-Associated Genetic Markers in a New Fast-Growing Carp Strain.

Common carp is one of the oldest and most popular cultured freshwater fish species both globally and in China. In a previous study, we used a carp strain with a long breeding tradition in China, named Huanghe, to create a new fast-growing strain by selection for fast growth for 6 years. The growth performance at 8 months of age has been improved by 20.84%. To achieve this, we combined the best linear unbiased prediction with marker-assisted selection techniques. Recent progress in genome-wide association studies and genomic selection in livestock breeding inspired common carp breeders to consider genome-based breeding approaches. In this study, we developed a 2b-RAD sequence assay as a means of investigating the quantitative trait loci in common carp. A total of 4,953,017,786 clean reads were generated for 250 specimens (average reads/specimen = 19,812,071) with BsaXI Restriction Enzyme. From these, 56,663 SNPs were identified, covering 50 chromosomes and 3,377 scaffolds. Principal component analysis indicated that selection and control groups are relatively clearly distinct. Top 1% of Fst values was selected as the threshold signature of artificial selection. Among the 244 identified loci, genes associated with sex-related factors and nutritional metabolism (especially fat metabolism) were annotated. Eighteen QTL were associated with growth parameters. Body length at 3 months of age and body weight (both at 3 and 8 months) were controlled by polygenic effects, but body size (length, depth, width) at 8 months of age was controlled mainly by several loci with major effects. Importantly, a single shared QTL (IGF2 gene) partially controlled the body length, depth, and width. By merging the above results, we concluded that mainly the genes related to neural pathways, sex and fatty acid metabolism contributed to the improved growth performance of the new Huanghe carp strain. These findings are one of the first investigations into the potential use of genomic selection in the breeding of common carp. Moreover, our results show that combining the Fst, QTL mapping and CRISPR-Cas9 methods can be an effective way to identify important novel candidate molecular markers in economic breeding programs.

Common carp is one of the oldest and most popular cultured freshwater fish species both globally and in China. In a previous study, we used a carp strain with a long breeding tradition in China, named Huanghe, to create a new fast-growing strain by selection for fast growth for 6 years. The growth performance at 8 months of age has been improved by 20.84%. To achieve this, we combined the best linear unbiased prediction with marker-assisted selection techniques. Recent progress in genome-wide association studies and genomic selection in livestock breeding inspired common carp breeders to consider genome-based breeding approaches. In this study, we developed a 2b-RAD sequence assay as a means of investigating the quantitative trait loci in common carp. A total of 4,953,017,786 clean reads were generated for 250 specimens (average reads/specimen = 19,812,071) with BsaXI Restriction Enzyme. From these, 56,663 SNPs were identified, covering 50 chromosomes and 3,377 scaffolds. Principal component analysis indicated that selection and control groups are relatively clearly distinct. Top 1% of Fst values was selected as the threshold signature of artificial selection. Among the 244 identified loci, genes associated with sex-related factors and nutritional metabolism (especially fat metabolism) were annotated. Eighteen QTL were associated with growth parameters. Body length at 3 months of age and body weight (both at 3 and 8 months) were controlled by polygenic effects, but body size (length, depth, width) at 8 months of age was controlled mainly by several loci with major effects. Importantly, a single shared QTL (IGF2 gene) partially controlled the body length, depth, and width. By merging the above results, we concluded that mainly the genes related to neural pathways, sex and fatty acid metabolism contributed to the improved growth performance of the new Huanghe carp strain. These findings are one of the first

INTRODUCTION
With the development of high throughput genotyping technology, the cost of genome-wide genetic markers decreased sharply, facilitating genomic selection (GS) and prediction of complex, quantitative traits (Meuwissen et al., 2001) After the first large-scale implementation of GS in dairy cattle breeding (Schaeffer, 2006), this technique has also been applied in other cultured species, such as pig (Knol et al., 2016), lupin (Yang et al., 2015), wheat (Charmet and Storlie, 2012), and Atlantic salmon Correa et al., 2015;Tsai et al., 2015;Bangera et al., 2017). Common carp Cyprinus carpio L. is among the handful of oldest and most important cultured fish species, both globally (FAO, 2015) and in China, where it was ranked third among the cultured freshwater fish species in 2016, with the annual production of 349,800 tones (FFABMA, 2017). As a result of centuries-long breeding tradition, a strain of common carp named Huanghe is deeply rooted in Chinese culture, and remains popular among consumers for its cultural and historical relevance, although its growth performance is objectively somewhat worse than that of other modern strains (Zhang et al., 2013). Using this strain as the base, recently we established a new fast-growing strain , here provisionally named Xinlong strain. The strain was selected for growth performance for 6 years using the best linear unbiased prediction (BLUP) and marker-assisted technology, resulting in a 20.84% improved growth rate of the F3 generation at 8 months of age. However, the genetics underlying this phenotypic difference remains unknown. Among these two breeding methods, BLUP can provide a higher genetic gain under controlled inbreeding rate and make full use of systematic effects (e.g., batch, sex, production environment, age variation, etc.), which can be estimated in a fitted mixed model for traits of interest (Ponzoni et al., 2010;Lind et al., 2012). As a result, BLUP is widely used in fish breeding (Lind et al., 2012). Xu et al. published a draft genome of the domesticated common carp (strain Songpu), developed the first highthroughput array for 250,000 SNPs, and evaluated its performance using samples from various carp strains (Xu et al., 2014a). However, identification of candidate genes or loci for GS with a large number of SNPs remained costly (Xu et al., 2014b). To address this issue, restriction site-associated DNA (RAD) tag sequencing was developed, in which only the stretches of DNA adjacent to recognition sites of a chosen restriction endonuclease are resequenced (Davey et al., 2011). In order to further reduce the cost for parallel genotyping of a large numbers of samples, a streamlined restriction site-associated DNA genotyping method named 2b-RAD (which uses type IIB restriction enzymes) was developed (Wang et al., 2012). Besides, protocols for library preparation were simplified and genome coverage regulation made more flexible (Wang et al., 2012). As this method combines the advantages of low-cost genome-wide genotyping and uniform distribution of markers, 2b-RAD has been utilized in different species for genotyping and association analyses (Seetharam and Stuart, 2013;Pecoraro et al., 2016).
Historically, two approaches have been applied to explore the candidate molecular markers significantly contributing to growth traits: fixation index (Fst) estimation based on the genome-wide SNPs information, and quantitative trait loci (QTL). In the past decade, QTL were determined for a number of economically important traits in common carp. For growth traits, Laghari et al. (2014) identified six QTL associated with body weight (BW), body length (BL), and body thickness (BT), and explained phenotypic variance in the range of 17.0-32.1%. In order to determine whether identified QTL can be applied across different common carp strains, Gu et al. identified three QTL for body weight, conserved both in mirror carp and Jian carp strains (Gu et al., 2015). Although conserved QTL were found, the specific pathways regulated by those QTL remained unexplored. Zheng et al. relied on the reference genome data  to explore a QTL located in a significantly upregulated actin cytoskeleton pathway on the chromosome LG45 using 250 microsatellite markers. This pathway plays a major role in the development of neurons and in structural changes of adult neurons (Luo, 2002). Although this approach can be used to successfully annotate the candidate QTL, a large number of molecular markers needs to be studied. High-density array method was traditionally used to capture a large number of genome-wide SNPs associated with different QTL (McCue et al., 2012;Houston et al., 2014). For example, in common carp, Zheng et al. detected 18 QTL putatively associated with four different muscle-and body fat-related traits . However, high-density array remains too expensive for the practical common carp breeding. Therefore, in this study, we took advantage of our newly-established fast-growing carp strain to attempt to apply the cheaper 2b-RAD method to identify, annotate and functionally verify growth-associated genome-wide SNPs.
The other traditionally used method to explore candidate molecular markers useful for breeding programs is the estimation of fixation index (Fst) on the basis of genome-wide SNP information. Al Abri et al. identified a correlation between the locus of ANKRD1 gene and withers height in horses under a dominant mode of inheritance using genome-wide Fst estimation and cross-population composite likelihood ratio test (Al Abri et al., 2017). Yang et al. calculated pairwise Fst values between Chinese indigenous and commercial pig breeds and found evidence of positive selection on the insulin-like growth factor 1 receptor gene (Yang et al., 2014a). Therefore, Fst-based approach can be used to examine the candidate genes for the selective breeding. On the basis of these findings, we designed our study of candidate loci for increased growth performance of the new Xinlong carp strain to include a large number of families. As 208 SNPs with significantly elevated Fst values were identified previously between strains of sheep susceptible and resistant to ivermectin in Haemonchus contortus using 2b-RAD technology , we hypothesized that cost-effective 2b-RAD technology can be used to explore the SNPs behind the faster growth of the new Xinlong strain.
Although both QTL and Fst estimation are effective methods for gene function discovery, they also have shortcomings (Korte and Farlow, 2013). Whereas, QTL mapping within full-sibling families provides low resolution, but high statistical power to identify regions of the genome that cosegregate with a given trait either in F2 populations or in Recombinant Inbred Line (RIL) families, Fst analysis provides good resolution, but its QTL detection power is influenced by the frequency of alleles and it is sensitive to the population structure. As this can result in false positives, the method requires a lot of verification work. Therefore, the objective of this study was to rely on genome-wide SNPs to combine QTL and Fst analyses to explore growth-related functional genes and loci in the new Xinlong strain of common carp. To achieve this, we compared the fast-growing new strain with the slow-growing control population of Huanghe carp. This study aims to use this approach not only to help breeders improve the growth performance of cultured fish (and other animal) species, but also attempts to address the much broader question of the genetic control of growth in animals in general.

Fish
A total of 250 common carp specimens belonging to 25 fullsib families (10 specimens/family: 8 offspring + 2 parents) were selected according to their breeding values predicted using BLUP (Ponzoni et al., 2005) from the Nanquan farm of the Freshwater Fisheries Research Center, Chinese Academy of Fishery Sciences. The population included 19 "selection" families (the newly selected Xinlong strain) and 6 "control" families (traditional Huanghe strain). These families were generated by artificially stimulating female brooders for spawning with hormonal injections, and then manually mixing the roe with milt from the matched male brooder. The fertilized eggs were then transferred to labeled hatching hapas (cage settle nets), and a week later larvae were transferred to separate nursery hapas. Each family was assigned a separate hapas, a unique ID, and spawning times were recorded.
Fine-mesh hapas were set up in one row within a concrete tank during both stages (fertilization and nursery). The size of the hapas was 120 cm (length) × 80 cm (width) × 100 cm (depth). Tanks were supplied with filtered well water and equipped with air stones for continuous aeration. Throughout the experiment, the dissolved oxygen ranged from 3.8 to 8.6 mg/L, pH from 7.2 to 8.5, and temperature from 18 to 20 • C (avg. = 19.40 ± 0.02). Fish were fed ad libitum (daily about 5% of their body weight), with a commercial feed containing 30% of protein. After about 3 months, when they reached about 10 cm in length, eight fish were randomly sampled from the offspring of each family and tagged individually with passive integrated transponder tags (PIT) produced by Biomark. Their body weight and body length (abbreviated as iBwt and iBlen, respectively) were measured.
Five months later, the fish were anesthetized with clove oil (75 mg/L) (Velisek et al., 2005), and body weight, length, width and depth (abbreviated as Bwt8m, Blen8m, Bwd8m, and Bdp8m, respectively) of each individual were recorded again to analyse the growth performance. The fish were then immediately euthanized using 0.2 mg/mL tricaine methanesulfonate (Sigma, USA), dissected, and 50-100 mg samples of caudal fins were collected. The dissected caudal fin tissues were flash frozen in liquid nitrogen, transported to the laboratory, and stored at −70 • C until DNA extraction.
Animal handling and experimental procedures were carried out in accordance with the guidelines for the care and use of animals for scientific purposes set by the Institutional Animal Care and Use Committee of the Freshwater Fisheries Research Center, and approved by the animal ethics committee of Chinese Academy of Fishery Sciences.

2b-RAD Library Preparation and Sequencing
Genomic DNA was isolated from the caudal fins using the Universal Genomic DNA Extraction Kit Ver. 3.0 (Takara, Japan) according to the manufacturer's instructions. Concentration and quality of the DNA were verified by spectrophotometry (optical density reading at 260 and 280 nm) and electrophoresis on 1.0% agarose gel.
2b-RAD libraries were constructed following adaptor and primer sequences, as well as the protocol, reported by Wang et al. (2012): 500 ng of DNA per specimen was digested in a 6 mL reaction volume using 1 U BsaXI at 37 • C for 1 h, followed by enzyme heat inactivation at 65 • C for 20 min. These digestion products were linked with T4 DNA ligase at 16 • C for 3 h, with subsequent heat inactivation for 10 min at 65 • C. PCR products were purified using the SPRIselect purification kit (Beckman Coulter, Pasadena, CA, USA) and quantified through a Qubit 2.0 Fluorometer (Invitrogen). The quality of all amplicon libraries was checked on 1.8% agarose gels and verified on Agilent 2100 Bioanalyzer. Finally, pooled libraries were sequenced on an Illumina HiSeqXten platform (Illumina, San Diego, CA, USA) using 150 base pair-end sequencing. All of the 2b-RAD sequences were archived in the NCBI SRA database (SRR6241620 and SRR6262716).

Quality Filtering and Genotyping
Quality check (QC) of sequenced reads and adapter trimming were performed by running a customized script (Pecoraro et al., 2016), to obtain 27 bp-long fragments. SNP calling was performed using STACKS v1.23 (Catchen et al., 2013). For each family, individual genotypes were constructed using components of the STACKS pipeline as follows: (i) Ustacks program was employed for building loci from all QC-passed reads using the following parameters: m3, M2, and N4; (ii) Genotyping was performed using procedures described by Jiao et al (Jiao et al., 2014). Terminal 3-bp positions were excluded from each read to eliminate artifacts that might have arisen at ligation sites. Reads with no restriction sites or containing ambiguous base calls (N), long homopolymer regions (>10 bp), excessive numbers of low quality positions (>5 positions with quality of <10), were removed. The remaining trimmed, high-quality reads formed the basis for subsequent analyses (Jiao et al., 2014).

Candidate SNP Detection and Gene Annotation
The phenotypic value is decomposed as: where y is the vector of observation value; b is the vector of fixed effect including the population mean and selection line effect; c is the vector of cage effect, and c∼N(0, Iσ 2 c ), σ 2 c is the variance of the cage component; g is the vector of additive genetic effect explained by polygenes, g∼N(0, Kσ 2 g ), where K is realized genetic relationship matrix calculated from genome-wide SNP information (VanRaden, 2008) and σ 2 g is the genetic variance explained by polygenes; q the is vector of allele substitution effects of the major QTL, which are treated as fixed effects; X, S, Z, W are the corresponding design matrices for b, c, g and q. e is the vector of residuals: e∼N(0, Iσ 2 e ), where σ 2 e is the variance of random error. The overall phenotypic variance-covariance can be expressed as: The mixed model equations for the model (1) are: and λ 2 = σ 2 e σ 2 g . The estimated genomic breeding values are expressed as: GEBV = Zg + Wq, and heritability of a trait is: where σ q2 is the variance of the breeding effect explained by all major QTL. Association analysis was performed using the StepLMM method (Lin et al., 2017).
Differentiation index Fst between selection and control groups was calculated using the VCF tool v0.1.11 (Danecek et al., 2011). SNPs with strong selection signature were examined by differential Fst values on genome-wide information (threshold value was 1% Fst). Manhattan graph of genome-wide Fst with window size 100,000 was plotted using Python2.7. The principal component analysis (PCA) was conducted by plink 1.07 (Purcell et al., 2007) and GCTA 1.13 (Yang et al., 2011). We identified candidate growth-associated genes or loci using the following criteria: (i) the genes contained the above-mentioned SNP loci with signature of selection; (ii) the genes were annotated. Functional information and annotation of genes were conducted by inquiring the common carp reference genome (Xu et al., 2014b).

Generation of Mutant IGF2a and IGF2b Zebrafish
As the QTL analysis identified two adjacent IGF2 genes (IGF2a and IGF2b) as the foremost candidate locus for different growth performance of the two strains, to further explore functions of these two genes in fish, we used CRISPR-Cas9 strategy to generate IGF2a-and IGF2b-knockout mutant zebrafish lines. Two guide RNAs (gRNAs) were designed, targeting exons of zebrafish IGF2a and IGF2b (Supplementary Table S1). gRNAs were cloned into the pT7gRNA vector as described (Lin et al., 2016) (oligonucleotide sequences are given in the Supplementary Table S1). gRNAs were transcribed using the MEGAscript T7 Transcription Kit and purified using the mirVana miRNA isolation kit. Cas9 mRNA was synthesized using the mMessage mMachine Sp6 Transcription Kit (all three kits were from Thermo Fischer Scientific, AM1334, AM1560 and AM1340 respectively) and purified using the RNA cleanup protocol from the RNAeasy mini kit (Qiagen 74104).
Wild-type AB strain zebrafish were injected at the one-cell stage with ∼50 ng gRNA and ∼100 ng Cas9 RNA. These F0 fish (n = 100) were raised to maturity and genotyped using fin clipping, DNA isolation and PCR spanning the target site (genotyping primers are given in the Supplementary Table S1). PCR products were analyzed for mutations by Sanger sequencing. Mutant PCR products were cloned into the pMD18-T vector (Takara, Dalian, China) and sequenced to identify the fish carrying the target frameshift mutation. These carrier fish were then back-crossed again to the AB wild type and the resulting F1 fish (n = 80) were raised to maturity. The F1 fish were genotyped (same as F0) to identify heterozygous mutant fish followed by cloning and sequencing of the PCR products to validate the presence of the frameshift allele.

Growth Performance
The creation of the Xinlong strain (Figures 1A,B) was officially recognized after 4 generations of selection . Except for the body length gain, which was non-significantly higher in the control group, all measured growth performance indices were significantly higher (P < 0.05) in the selected group than in the control group (Table 1).
Genotypes 2b-RAD libraries were sequenced for 250 specimens (190 selected and 60 control), generating a total of 4,953,017,786 (C) Principal Component analysis: the first dimension PC1 was assigned to X axis and the second dimension PC2 was assigned to Y axis. Red triangles represent the selection group, black circles the control group.

Genetic Differentiation
To examine the genetic structure of these two groups, we conducted principal component analysis (PCA) based on genomic SNPs. Selection line showed more variation in PC1 and PC2 scores than the control line ( Figure 1C). To test whether selection and control lines exhibit marks of artificial selection pressures, genome-wide genetic diversity was measured by nucleotide diversity π (Figures 2A,B). Both lines had similar average genetic diversity: selection π = 8.15 × 10 −6 , and control π = 9.00 × 10 −6 (Figure 2A). Interquartile ranges (a measure of statistical dispersion, being equal to the difference between 75 and 25th percentiles) were 3.06 × 10 −6 -1.15 × 10 −5 in selection line and 3.95 × 10 −6 -1.24 × 10 −5 in control line. All distributions were right-skewed, with asymmetry coefficients ranging between 1.49 (selection line) and 1.53 (control line). Frequency distribution of π values within non-overlapping 2 Mb windows across the genome was similar in both lines ( Figure 2B).
In order to find out whether the genes involved in the most variable region in the selected strain are beneficial for further selection for target traits, we defined 1,158 domains with the top 50 π values in the selection line as highly variable genomic regions. Annotation revealed that genes found in these highly variable genomic regions belonged to several functional classes: fatty acid metabolism [17 beta-hydroxysteroid dehydrogenase type 8, HSD17B8 ( Table S4). Among them, 50.41% were located in intergenic regions, followed by intronic regions (37.7%) (Supplementary Figure 2). Further analyses were focused only on those loci which could be associated with specific genes. Some of them were associated with sex factors, such as 17β-Hydroxysteroid dehydrogenase 3 (HSD17B3), which is regarded as a male reproductive marker (Hoffmann et al., 2016), and SPARC-related modular calcium binding 2 (SMOC2), involved in gonad and reproductive tract differentiation (Pazin and Albrecht, 2009;Roy et al., 2013). Other were mostly related to nutrition metabolism: Ras homolog enriched in brain (RHEB), involved in skeletal myogenesis through suppression of insulin and hypothalamic neurons that regulate food intake and peripheral metabolism Yang et al., 2014b;Meng et al., 2017); Ras-related protein Rab-10 (RAB10), related with insulinstimulated glucose uptake in adipocytes and spermiogenesis (Karunanithi et al., 2014;Sano et al., 2015;Lin et al., 2017); vasoactive intestinal peptide receptor 2 (VIPR2), associated with muscle mass and force, lipolytic effects, obesity, oocyte maturation and circadian rhythms (Akesson et al., 2005;Hinkle et al., 2005;Liu et al., 2010;Zhou et al., 2011;An et al., 2012). Among the metabolism-related genes, a number of genes could be directly related to the lipid metabolism: acyl-CoA synthetase family member 2 (ACSF2), related to long-chain fatty acid metabolic processes (Pashaj et al., 2013); tenomodulin (Tnmd), related to adipocyte differentiation and beneficial visceral adipose tissue expansion (Senol-Cosar et al., 2016); and Syntrophin beta 1 (SNTB1), related to HDL cholesterol metabolism (Okuhira et al., 2005). Finally, Tran membrane channel like 7 (TMC7) and Collagen α-1 (XXIII) chain (COL23A1) were associated with milk production traits in Holstein cows (Lee et al., 2016b); and transient receptor potential cation channel subfamily M member 7 (TRPM7) was associated with osteogenic induction (Liu et al., 2015).

QTL Analysis
In order to study the differences between the results of Fst and QTL analyses, the QTL analysis was conducted as described before : a total of 18 QTL, located on 14 chromosomes, were identified for the 6 studied growth traits (Tables 1, 2). Two overlapping QTL for Bwd8m and Blen8m were located on chromosomes 21 and 48. One of these two was also associated with Bdp8m. More than two QTL were associated with Blen8m, Bwd8m, and Bdp8m, whereas iBwt, iBlen, and Bwd8m had only one QTL associated with them (Figure 3). Some growth traits were associated with a larger number of QTL with smaller individual contributions, whereas others were under the control of a smaller number of QTL with higher individual contributions ( Figure 3A). For example, Bdp8m was associated with a smaller number of SNPs and high contribution from individual chromosomes ( Figure 3A). Among all studied traits, Bwd8m exhibited the highest contribution from a single QTL (Figure 3B). In the regions where identified QTL were located, we found three classes of functional genes: nervous system-associated genes, signal regulation pathways-associated genes, and metabolismassociated genes (Supplementary Table S5). Heritability values of the six growth indices were medium or high, ranging from 0.36 for Bwt8m to 0.59 for Blen8m (Table 3). StepLMM analysis was used to analyze the genetic variance components. The results showed that genetic variance of Bwd8m and Bdp8m and Blen8m can be attributed mostly to the effects of QTL (Table 3). Bwt8m and iBlen exhibited the highest polygenic effects, whereas Blen8m had the highest genetic, error and QTL variances. Importantly for breeding programs, Bdp8m and Blen8m exhibit a combination of most of their genetic variance accounted for by QTL and high heritability.  Combining Fst and QTL Analyses QTL explored above were appended to the Manhattan map of Fst values (Figure 4A). After filtering the overlapping regions, where observed QTL were located in 95% quartile boxes within 1Mb windows, one box on the chromosome 21 was identified as a candidate locus for growth performance ( Figure 4B). Two IGF2 sequences (IGF2a and IGF2b) were annotated in this region of the common carp genome (Xu et al., 2014b). As most mammals and teleost fishes possess only one IGF2 gene copy, in order to assess the (dis)similarity of these two sequences, Su et al. cloned IGF2a and aligned it with IGF2b, and proposed homology between these paralogs in common carp and zebrafish (Su et al., 2012). We aligned the two genomic regions containing these two genes, including the upstream and downstream sequences, and found a high similarity of the two paralogs to their zebrafish orthologs ( Figure 4C). Frequencies of all SNPs in the vicinity of these two genes (upstream 5 kb and downstream 5 kb) were plotted (Figures 4D,E). In the IGF2a sequence, only A was found on the position 492,571 in the selection group, whereas only T was found on the positions 498,493 and 522,145 in the control group ( Figure 4D). In the IGF2b sequence, G frequency on the position 7,481,248 and T frequency on the position 7,507,009 were higher in the selection group than in the control group ( Figure 4E). These results suggest that IGF2a and IGF2b genes might be associated with the growth rate of this fish.

IGF2a and IGF2b Knockout in Zebrafish
To test this hypothesis, we used CRISPR-Cas9 to generate IGF2a-knockout and IGF2b-knockout mutant zebrafish lines. We confirmed the successful knockout by sequencing these genes in wild and mutant types; sequence comparison shows a frameshift mutation (Figures 5A,B). Both mutant lines exhibited differences (albeit mostly above the statistical significance threshold) in growth traits in comparison to the wild type (Figures 5C-J). The IGF2a-knockout fish were shorter but heavier and deeper-bodied than the wild type (Figures 5C-F), whereas the IGF2b-knockout fish were also shorter and deeper-bodied, but lighter than the wild type (Figures 5G-J). Intriguingly, for IGF2a these differences in body weight/length ratio were more strongly pronounced in males (the only statistically significant difference; Figures 5F,J). These results indicate that both genes might be involved in growth regulation in zebrafish, but further studies are needed to corroborate this. We hypothesize that nucleotide substitutions might regulate the growth by influencing the transcription factor binding (Laurila and LãHdesmãKi, 2009) or by influencing the protein structure (Perng et al., 1999).

DISCUSSION
Selection on a limited number of markers for quantitative traits often misses a substantial portion of the genetic variance contributed by loci with small effects (Poland et al., 2012). As a result of this, although genomic breeding facilitates direct selection for favorable alleles in selective breeding schemes, it is not considered to be cost-effective due to a large number of genome-wide markers required for its successful application. Usually, high-density or low-density assays were used in animal breeding, with such examples as Bovine SNP50 chip with 54,001 SNP loci to support genome wide association applications in cattle (Matukumalli et al., 2009), and BovineLD chip with 6,909 SNPs, which aimed to facilitate low-cost GS for taurine in beef and dairy cattle (Boichard et al., 2012). A combination of the two (high-and low-density assays) was proposed as a promising strategy for the implementation of GS at acceptable costs, where a panel size of 384 markers could be recommended for the selection of candidates for pig breeding programs if at least one parent of selection candidates is genotyped at high-density (Wellmann et al., 2013). In common carp, although a high-density array with 250,000 genome-based SNPs had been developed (Xu et al., 2014a), this array was only used to detect the QTL for growth and sex-related traits (Peng et al., 2016). However, a cost-effective combination of high-density SNP genotyping platform and capturing genome-wide SNPs for a large number of specimens remains unavailable in practice. In order to explore the candidate molecular markers for GS in the Xinlong strain, we mapped the total of 56,663 SNPs on the genome-wide level for parents and their offspring using the 2b-RAD methodology. To ensure that our results have broad applicability, specimens from 25 different families were selected to form the selection and control groups. Genetic diversity analysis indicates that selection line was as genetically diverse as the heterogeneous sample set of the control line. Thus, selection did not cause a major genetic diversity loss in the Xinlong strain, and we can continue the breeding program for this population.

Features of QTL and Fst Methods in Exploring the Candidate Loci
As previous QTL analyses in common carp relied on data from only one family (Laghari et al., 2014;Lu et al., 2015), it remains unclear whether these QTL are applicable to multifamily studies, or whether this approach identified only a small proportion of useful QTL due to high genetic relatedness of the sample. Moreover, as Lv et al. found that growth-related (BW, TL, and BT) QTL only partially overlapped between families in their multifamily study, this indicates that a single-family approach does not necessarily produce results applicable across different families (Lv et al., 2016). Therefore, a multifamily approach is needed to study growth-related QTL in the Xinlong strain. In comparison to QTL, an advantage of the fixation index estimation is that it can also examine the candidate genetic regions with selective signals. Flori et al. identified 13 regions subjected to strong and/or recent positive selection in cows by estimating Fst values, and found that some of them contained genes with a strong effect on milk production (Flori et al., 2009). Moon et al. reported that genes with strongest signals of directional selection for traits important for pig reproduction and production under artificial selection belong to the group III of metabotropic glutamate receptors (Moon et al., 2015). Similarly, candidate SNPs should be tested for their across-population applicability. Zhao et al. applied two complementary methods to detect selection signatures in seven different cow breeds: Fst was used to detect 704 individual SNPs, and results then compared to the integrated haplotype score (iHS, a measure of the amount of extended haplotype homozygosity at a given SNP along the ancestral allele relative to the derived allele) (Zhao et al., 2015). This indicates that Fst can be used to determine the candidate SNPs, or genes, or signals of selection, between selection and control groups in the Xinlong strain breeding program. In summary, the QTL method is sensitive to population structure and has lower resolution, while the Fst method has higher resolution. Using the traditional (non-highthroughput) methods of molecular marker data collection (such as microsatellites and restriction fragment length polymorphisms), QTL is a suitable method for target species without a published reference genome. In order to avoid identifying the candidate QTL specific to only one family, we used 25 families and the cost-effective 2b-RAD genotyping capture method to examine candidate molecular markers in the Xinlong breeding population. For this, we modified a method recently developed to estimate the QTL of a limited number of individuals belonging to 8 families . This way, the identified QTL should be applicable to other studies of this strain. In order to further increase the resolution, Fst analysis was also used to study candidate markers inferred from genome-wide SNPs. In order to avoid false positive errors, gene knockout technology was used to verify the candidate genes identified by overlapping the QTL and Fst approaches.

QTL Discovered
Using 25 full-sib families of the Xinlong strain and high throughput sequencing, we identified 18 QTL associated with growth traits, which contributed to genetic variance from 7 to 100%. One QTL located on the chromosome 48 was shared among three growth indices (Blen8m, Bdp8m, and Bwd8m), which may partially explain the interconnectedness of these growth traits in common carp . Although there exists a strong correlation in growth curve dynamics between different growth and developmental stages (Ibáñez-Escriche and Blasco, 2011), which indicates that growth traits in different developmental stages should be controlled by the same QTL (Bartholomé et al., 2013;Würschum et al., 2014), we could not identify any body weight-associated QTL shared between different growth stages. This illustrates that further studies are needed in this aspect. Peng et al. identified 22 QTL for growthrelated traits at 18 months of age on the basis of chromosomewide log odds score (LOD; with significance threshold set at P < 0.01), and found that they can explain 15.5-45.1% of the total phenotypic variation (Peng et al., 2016). Compared to our research, they merely reported candidate QTL regions and some genes in these regions, but they did not study SNP positions and their contribution to the phenotypic variance. Beside this, we also determined whether a specific growth trait is controlled by a major QTL effect or a polygenic effect using the StepLMM approach.
Neural Regulation May be a Major Component Behind the Higher Growth Performance of the Selection Line To our knowledge, definitive genes or loci that can explain the growth performance variance mostly remain unavailable for carp breeding programs. Using gene chip information, Peng et al. identified candidate regions containing many important growth performance regulators, such as KISS2, IGF1, SMTLB, NPFFR1, and CPE (Peng et al., 2016). In our study, several candidate genes for QTL related to growth performance of the Xinlong strain were determined, such as SHANK2 and MDGA2 for iBwt and Bwt8m, respectively. The former gene, SHANK2, was associated with both excitatory and inhibitory synapses (Vaccarino et al., 2009), whereas a mutation in MDGA2 was shown to elevate excitatory transmission, implying that MDGA2 blocks neuroligin-1 interaction with neurexins and suppresses excitatory synapse development (Connor et al., 2016). A direct connection between body weight and neural regulation was reported in recent years: multiple neural circuits which can affect the body weight by regulating the balance between food intake and energy expenditure were identified in the brain (Rui, 2013;Rathjen et al., 2017). Two loci were identified near the cell adhesion molecule 1 and 2 genes (they encode membrane proteins that mediate synaptic assembly); these two loci influenced the expression of these two genes, thereby disrupting energy balance and promoting weight gain (Rathjen et al., 2017). As regards brain-derived neurotrophic factor-like (BDNF), which is important for synaptic plasticity, learning and memory production, the serum area under the curve of BDNF significantly decreased after body weight reduction for obese men (Lee et al., 2016a), whereas a reduction in body weight and improved glucose tolerance in obese mice were accompanied by an increase in BDNF levels .
In common carp, sex has a strong impact on growth performance, as females generally have higher body weight than males. In the present study, a gene named NRXN1A was identified as a QTL related to Blen8m. In zebrafish, this gene is expressed in the adult testis and in the earliest stages of development, before the beginning of zygotic transcription (Rissone et al., 2007). Therefore, sex-specific regulation may be involved in the higher Blen8m in the selection line.

Candidate Loci Identified by Both QTL and Fst Analyses
The combination of genome-wide selective signatures and QTL information revealed that genetic differences between the Xinlong strain and the control line could be roughly classified into three categories of pathways. (1) Neural pathways: RHEB (Yang et al., 2014b;Meng et al., 2017) and BDNF are involved in nutritional regulation, including food intake and peripheral metabolism (Meng et al., 2017). MDGA2, NRXN1a, SHANK2 and REEP2 genes are involved in physical excitatory transmission process (Vaccarino et al., 2009), while LNX1 is involved in neuromuscular junction (Vaccarino et al., 2009). Finally, vasoactive intestinal peptide receptor 2 (VPAC2) regulates the muscle mass and force (Ilegems et al., 2010;Hurt et al., 2014), which is directly related to growth performance. (2) Sex pathways. Genes implicated in these pathways can be divided into two main subcategories: gonad development and sex hormone regulation of growth stimulation. Three genes are associated with the former subcategory: NRXN1A, SMOC2, and RAB10 are involved in gonadogenesis, specifically spermiogenesis and testes development (Young et al., 2005). Two genes from the same family are associated with the latter category: HSD17B3 is a male reproductive marker (Pazin and Albrecht, 2009) which can take part in the production of the male sex hormone, testosterone, from a precursor hormone called androstenedione, and HSD17B8 is involved in the conversion of estradiol (E2) to estrone (E1) (Lin et al., 2016). (3) Fatty acid metabolism regulation pathways: RAB10 is involved in insulin-stimulated glucose uptake in adipocytes (Lee et al., 2016b;Senol-Cosar et al., 2016). Tenomodulin (TNMD) is associated with adipocyte differentiation and beneficial visceral adipose tissue expansion (Villar et al., 2007). As opposed to the above fat depositionrelated genes, VPAC2 has lipolytic effects (Karunanithi et al., 2014); which suggests that the faster growth of the Xinlong strain is a consequence of a complex bidirectional balance of fat metabolism regulation, rather than unidirectional fat depositionpromoting metabolic control. Finally, ACSF2 is associated with the long-chain fatty acid metabolism (Sano et al., 2015), which is in agreement with the higher DHA content observed in the muscles of Xinlong strain . Finally, some of the identified genes were associated with fat metabolismrelated economic traits: TMC7, neurexin 3 (the same gene family as NRXN1a) and COL23A1 were associated with milk production of Holstein cows using genome-wide SNPs (Lee et al., 2016b).

IGF2a Exhibits Sex-Biased Growth-Promoting Effects
IGF2a knockout male zebrafish had lower body length and depth than the wild type, but female fish did not exhibit a significant difference. The result verified that normally active paternal allele of the IGF2 gene can promote growth in mammals (Li et al., 1993;Silva et al., 2006). Similarly, in pigs, a paternally expressed QTL affecting skeletal and cardiac muscle mass was located in the IGF2 locus (Jeon et al., 1999). By constructing placental-specific IGF2 knockout (P0) mice, Dilworth et al. found that fetal and placental weights in P0 mice were reduced when compared with WT at days 17 and 19 of the embryonic development. Beside this, P0 mice exhibited hypocalcemia and increased placental transport of calcium from the embryonic day 17 (E17) and E19 (Dilworth et al., 2010), which indicates that calcium transport in the intestines can be affected by gonadal hormone status in sexually maturing male rats (Hope et al., 1992). This suggests that IGF2a might be promoting growth by regulating calcium transport.
In conclusion, cost-effective genome wide molecular markers were explored using 2b-RAD, and a limited number of SNPs were identified by merging results of QTL and Fst analyses. These loci were located in two growth performance-associated candidate genes, IGF2a and IGF2b. Involvement of these genes in growth regulation in fish was further studied by generating Crispr-Cas9 knockout zebrafish lines. Although results indicate that both genes might be involved in growth regulation, further studies are needed to corroborate this. Regulatory mechanisms of body weight dynamics remain only partially understood, and it remains unclear why candidate loci for growth performance differ between developmental stages. In the follow-up studies, we aim to combine genomic and transcriptomic information to address this issue.