Genome-wide association study reveals markers and candidate genes associated with growth in the rice flower carp, an economic fish species of integrated rice-fish culture in China

The rice flower carp (Cyprinus carpio) is an important fish in integrated rice-fishery farming. Here, we performed the first genome-wide association study (GWAS) for seven growth traits (including body mass, total length, body length, body height, body width, caudal-peduncle depth, and eye spacing) in 200 rice flower carp samples using 369,688 high-quality SNPs and 42,225 indels obtained by double-digest genotyping-by-sequencing (ddGBS). The morphometrics of these traits were highly correlated (Pearson’s correlation coefficients = 0.74–0.99, p < 0.001). GWAS detected 15, 5, 4, 26, 7, 16, and 17 loci significant associated (-log10P ≥ 5) with body mass, total length, body length, body width, body height, caudal-peduncle depth, and eye spacing, respectively. Subsequently, within the 50 kb upstream and downstream regions surrounding these significant loci, 38, 19, 18, 20, 52, 27, and 37 candidate genes for the seven growth traits were detected, respectively. Importantly, B6_4352672 and A8_4978825 were significantly associated with more than five growth traits. These results showed loci significantly associated with more than five growth traits will be helpful for future marker-assisted selection (MAS). Interestingly, chromosomes A8 and B25 had many loci significantly associated with growth traits, most of which were shared among multiple growth-related traits. These results indicated that chromosome A8 and B25 may be closely related to growth traits. Our findings not only help understand the genetic architecture of growth traits in fish but facilitate the identification of candidate genes for marker-assisted selection towards breeding faster-growing rice flower carp in the future.


Introduction
The rice flower carp (Cyprinus carpio), belonging to the family Cyprinidae, is an essential economic fish species in the integrated rice-fish culture in China, which has been cultured in paddy fields for over 1000 years, and the Quanzhou rice flower carp of Guangxi Zhuang Autonomous Region in China was also listed as an imperial tribute in the Qing Dynasty (Jiang and Yan, 2009). Through longterm cultivation and artificial selection in the rice field environment, the rice flower carp gradually formed unique morphological characteristics of body color and shape, which distinguishes it from other cultured common carp (Cyprinus carpio) groups (Wu et al., 2021). Because the rice flower carp eat the rice flower, they have many desired qualities, such as high protein content, unique flavor with no muddy smell, and soft and edible intermuscular spines (Yang and Jiang, 2009). These qualities promote the development of integrated rice-fish culture and increase the demand and commercial value. However, the rapid expansion of rice flower carp aquaculture has led to tremendous challenges, including slow growth rates, large individual differences, and uneven listing specifications, which have seriously limited the sustainable development of this species' industry . Therefore, with increasing market demand, selection for fast growth would be one of the most important breeding goals for the rice flower carp.
Molecular marker-assisted breeding can increase the speed of breeding by nearly 11% compared with traditional methods (Gomez and Klemetsdal, 1999). In addition, the heritability of body mass, body length, and body width of rice flower carp were 0.45, 0.42, and 0.36, respectively , suggesting that molecular marker-assisted breeding can be used for the selection of superior strains with a rapid growth rate of rice flower carp because of the high heritability of growth traits. Single nucleotide polymorphisms (SNPs), the most reliable third-generation molecular marker, have been widely used in candidate-marker/ gene association studies of important traits, and molecular markerassisted breeding (Vignal et al., 2002). Identification of SNP markers associated with target traits is key for molecular markerassisted breeding.
Currently, genome-wide association studies (GWAS) based on a large number of SNP markers have been widely used for identifying markers/genetic variants that are significantly associated with various economic traits in fish, such as sex, stress tolerance, and growth (Berghof et al., 2018;Ji et al., 2019;Gao and Yang, 2021;Luo et al., 2022;Mou et al., 2022). In recent years, SNPs have mainly been detected and genotyped at the genome-wide level using gene chips, simplified genome sequencing, and whole genome resequencing. In reduced-representation genome sequencing (RRGS), high-density molecular markers have been obtained at the level of the whole genome (Miller et al., 2007). Genotyping-bysequencing (GBS) is a simplified genome sequencing method that is inexpensive and reliable (Davey et al., 2011;Bian and Wang, 2017), and is widely used in population genetics (Nunez et al., 2015;Drury et al., 2016) and genome-wide association studies (Wu et al., 2019;Habimana et al., 2021). With the development of GBS technology, double-digest genotyping-by-sequencing (ddGBS) can effectively reduce genome complexity and make genome-wide genetic analysis detection more convenient and reliable (Elshire et al., 2011). In recent years, ddGBS genotyping has been applied for SNP marker development (Jia et al., 2020;Liu et al., 2022), genetic linkage map construction , and GWAS (Gileta et al., 2022). However, the study of growth-associated marker of rice flower carp using GWAS is lacking so far.
In this study, 200 rice flower carp individuals of a population were sequenced by ddGBS, and GWAS was first conducted based on mixed linear model for identify seven growth-traits (including body mass, total length, body length, body height, body width, caudal-peduncle depth and eye spacing) associated markers and candidate genes. The findings of this study will be helpful for future marker-assisted selection (MAS) and speculating on the genetic mechanisms underlying the growth regulation of rice flower carp.

Ethics statement
All experiments in this study were conducted in strict accordance with the ethical guidelines and regulations of the Animal Care and Use Committee of Southwest University and were approved by the local ethics committee of the Animal Welfare and Ethical Committee of Southwest University.

Sample preparation
The experimental fish parents were obtained from Quanzhou County, Guilin City, Guangxi Zhuang Autonomous Region, and 120 pairs of rice flower carp (1:1 male to female ratio) were introduced from the Guangxi Agricultural Fine Variety Hainan South Propagation and Breeding Base, San Ya, China. The experimental fish were bred after artificial fertilization. Two hundred 20-month-old fish were randomly collected for measuring seven body size traits, including body mass (BM), total length (TL), body length (BL), body height (BH), body width (BW), caudal-peduncle depth (CPD), and eye spacing (ES). The tail fins of each individual were collected and preserved in 100% ethanol at 4°C for DNA extraction. Genomic DNA of each sample was extracted using DNAsecure Plant Kit (TIANGEN) according to the manufacturer's protocols. The quantity and quality were assessed using 2.0 Flurometer (Life Technologies, CA, USA) and 1.5% agarose gel electrophoresis, respectively.

GBS library construction and sequencing
Genomic DNA from each sample was used to prepare the GBS library with the NEB Next ® Ultra DNA Library Prep Kit for Illumina ® (NEB, USA). In brief, 100 ng genomic DNA from each fish was digested with EcoRI and NlaIII at 37°C for 1 h, and then added End Prep Enzyme Mix at 20°C for 30 s, 65°C 30 s for terminal repair, phosphorylation and addition of A-tail. The double DNA fragments were mixed with EcoRI adapter, NlaIII adapter and T4 DNA ligase at 20°C for 1 h, then added 3 mL USER enzyme at 37°C for 15 min. The DNA fragments with 400-600 bp filtered by AMPure XP beads were used to construct PCR library using universal PCR primers and index primers. After purifying and quality inspection, sequencing was carried out on an Illumina HiSeq 4000 platform.

Analysis of the population structure
Neighbor-joining (NJ) phylogeny construction was performed by MEGA-X (Kumar et al., 2018) with model of p-distance, bootstrap runs: 500. Population structure of the 200 fish was calculated by Admixture (v 1.3) (Alexander et al., 2009) with maximum likelihood approach. The putative number of subgroup (K-value) was set from one to nine, and the optimal K-value was determined by the lowest cross-validation error. The genetic composition of each sample in subgroup was mapped using Pophelper (v 2.2.7) (Francis, 2017). Principal component analysis (PCA) and kinship analysis of the population were performed using GCTA (v1.93.2) with the default settings (Yang et al., 2011).

Genome-wide association study
Pearson correlation coefficients between seven traits and principal component analysis (PCA) were calculated using R (v 3.5.0) (https://www.R-project.org/) and SPSS20.0, respectively. Principal components were extracted according to the principle of an 85% cumulative contribution rate. Seven growth-associated traits from 200 individuals were selected for GWAS analysis using GEMMA (v 0.98.1) based on mixed linear model (MLM) (Zhou and Stephens, 2012). The p value of genome-wide significance threshold was calculated by Bonferroni correction based on the formula p value = 0.05/n, n represents the number of markers used for GWAS. Manhattan plot representing the -log 10 (p) for threshold of extremely significant association and significant association were set as 6.9 and 5, respectively. Quantile-quantile (Q-Q) plot expressing the observed and expected -log 10 (p) values were produced using R software.

Identification of candidate genes
The candidate genes associated with growth were obtained by mapping significant markers onto the common carp genome (GCF_018340385.1) (Li et al., 2021) with scanning up-and down-stream 50 kb regions from the significant markers. Then, obtained candidate genomic regions were mapped to the contig by local sequence similarity searches and annotated with an inhouse script.

Statistics of growth traits
In this study, seven morphological traits for growth traits of 200 rice flower carp were measured, as shown in Table 1. The mean values of BM, TL, and BL traits (± SD) were 638.38 ± 302.33 g, 32.72 ± 4.37 cm, and 24.60 ± 3.52 cm, respectively. Pearson's correlation coefficients between all pairs of the seven morphological traits ranged from 0.74 to 0.99 ( Figure 1). These seven traits were strongly and positively correlated with each other.

Sequencing data and population structure
Using the Illumina sequencing platform, a total of 1,919,365,466 clean reads and 270.12 Gb clean data with 43.15% GC content were obtained from 200 rice flower carp. The average Q20 and Q30 were 97.32% and 95.02%, respectively (Table S1). After mapping all clean reads on to the common carp genome (GCF_018340385.1) (Li et al., 2021) and filtering low-quality SNPs, a total of 369,688 high-quality SNPs and 42,225 indels were used to analyze the population structure and GWAS.
To understand the genetic relationships among these experimental individuals, we conducted phylogenetic tree, population structure, PCA, and kinship analyses. Phylogenetic tree analysis indicated that these 200 rice flower carp were mainly divided into nine branches ( Figure 2). In the population structure analysis, the optimal K-value (K = 9) was determined using the lowest cross-validation error ( Figure S1). The results of the population genetic structure and PCA are shown in Figure S2 and Figure S3, respectively. Apparently, the two analysis methods yielded consistent results. The genetic relationship coefficient ranged from -0.24 to 0.69 ( Figure S4).

GWAS of growth traits
Based on 369,688 high-quality SNPs and 42,225 indels, GWAS of seven growth-associated traits from 200 individuals was carried out using GEMMA (v 0.98.1) based on a mixed linear model (MLM). The results showed that five extremely significant associated (-log10P ≥ 6.9) ( Table 2) and 85 significant associated (-log10P ≥ 5.0) loci with growth-associated traits (Table S2, Figure 3). Among these loci, one locus located on chromosome B6, two loci located on chromosome B25 and A3, and two loci located on chromosome B25 and B6 were extremely significantly associated with BM, BW, and ES, respectively (Table 2). Additionally,15,5,4,26,7,16, and 17 loci were significant associated with BM, TL, BL, BW, BH, CPD, and ES, respectively. BW had the greatest number of extremely significant and significant loci, followed by ES and BM. Interestingly, three chromosomes of A3, B6, and B25 had significant loci associated with growth traits, in which chromosome B25 has more significant loci than other chromosomes, except for chromosome A8. Fifteen loci were located on chromosome A8, and locus A8_10388533 was associated with BM, TL, BL, and ES (Table 3). Among the loci located on chromosome B25, B25_2114981 was shared among three traits including BM, BW, and ES. B25_556773, B25_851845, B25_963540, and B25_963625 were related to BW and ES traits (Table 4).

Candidate genes related to growth traits
Genes potentially related to the seven growth-associated traits in rice flower carp were explored after annotating the upstream and downstream 50 kb genome sequences based on significant loci. The number of candidate genes associated with the seven growth traits was 38,19,18,20,52,27, and 37 for BM, TL, BL, BH, BW, CPD, and ES, respectively (Table S3). Based on the analysis above, we found that 14 and 18 genes harboring significant loci were associated with growth traits in chromosomes A8 and B25, respectively (Table 5). A total of 40 candidate genes were associated with at least two growth traits, of which five candidate genes, including tripartite motif containing 25 (trim25l), trafficking kinesin protein 2 (trak2), STE20 related adaptor beta (stradb), neurobeachin like 1 (nbeal1), and RNA binding motif protein 26 (rbm26), were shared among the seven traits. Ubiquitin carboxyl-terminal hydrolase 11 (usp11), ER membrane protein complex subunit 3 (emc3), UXT (uxt), ETS domain-containing protein Elk-1 (elk1), leucine-rich repeat neuronal protein 1 (lrrn1), and OTU domain-containing protein 5-A (otud5a) were shared among the five traits (BM, TL, BL, BW, and ES).

Discussion
The growth properties of the rice flower carp, an economically important fish species in integrated rice-fishing culture in China, affects fish production and economic value. In the present study, the first GWAS for growth traits in rice flower carp was conducted to identify markers and genes related to growth traits. A total of 90 loci that were significantly related to growth traits were detected in multiple chromosomes (Table S3), indicating that growth traits were complex quantitative traits controlled by multiple loci and multiple genes (Löhr et al., 2018;Yin et al., 2020). Similar results have been reported for many aquatic animals. For example, in Takifugu rubripes, 394 SNPs associated with growth traits were distributed among all 16 chromosomes (Wang, 2022); and in Epinephelus lanceolatus, a total of 36 growth-related SNPs spread over 14 chromosomes were detected (Wu et al., 2019).
The location of loci showed 31 loci associated with growthrelated traits were concentrated on chromosomes A8 and B25, which indicated that chromosomes A8 and B25 may be closely related to growth traits of rice flower carp. Similar results have been reported in many fish species. In common carp, a total of 10 SNP loci distributed on LG8 for body mass were identified based on an ultra-high density linkage map and QTL mapping (Peng et al., 2016). In Larimichthys longirostris, the most obvious relationship to growth traits were located on chromosome 16 (Mou et al., 2022). Most growth-associated loci were clustered on chromosomes A8 and B25, suggesting that these two chromosomes may be central to The phylogenetic tree of 200 rice flower carp. Luo et al. 10.3389/fmars.2023.1130667 Frontiers in Marine Science frontiersin.org identify growth-related genes for rice flower carp. In this study, 14 and 18 candidate genes were identified surrounding these loci in chromosomes A8 and B25, respectively. Among them, some candidate genes were involved in the regulation of the cell cycle and cells activity. Fizzy-related protein homolog (fzr1), a critical cell cycle regulator, plays a role in regulating the cell cycle by participating in the ubiquitin proteasome pathway (Manchado et al., 2010;Lu, 2013). Usp11 regulates BRCA2 expression levels by deubiquitinating it and participates in the cellular response to DNA damage (Schoenfeld et al., 2004). Uxt is a component of centrosome and is essential for cell viability (Zhao et al., 2005). Lrrn1 could suppresses GC cell apoptosis by downregulating AP-1 . And some candidate genes were associated with visual function, innate immunity, osteogenic differentiation, and metabolic activity. Mutation of the zebrafish emc3 homolog, partial optokinetic response b (pob), causes the degeneration of photoreceptor cells (Taylor et al., 2005). Metabotropic glutamate receptor 6-like (mglur6) mediates excitatory chemo-synaptic transmission, and is a critical site for regulating information transmission in the optic rod pathway (Xu et al., 2000). Otud5 negatively regulates natural immunity by inhibiting the production of type I interferons (Kayagaki et al., 2007). Elk1 has been reported to be involved in osteoblast differentiation (Kousteni et al., 2003;Wiren et al., 2004;Kim et al., 2014). Nuclear factor 1 C-type-like isoform X1 (nfic) is associated with organismal osteogenesis (Zhang, 2017). Coiled-coil domain-containing protein 92 (ccdc92) is a protein-coding gene, and GWAS of several dyslipidemia and coronary heart disease in humans showed that the ccdc92 gene is localized in or adjacent to a susceptible region, including triglycerides and high-density lipoprotein cholesterol (Teslovich et al., 2010;Willer et al., 2013;Gupta et al., 2015;Spracklen et al., 2018). Previous research has shown that a multi-trait comprehensive breeding strategy should be adopted for the breeding of rice flower carp . In this study, loci B6_4352672 was shared among seven growth-related traits, and A8_4978825 was shared among five growth-related traits, suggesting that B6_4352672 and A8_4978825 may be suitable for future molecular marker-assisted selection (MAS). Five candidate genes were identified surrounding B6_4352672, including trim25l, trak2, stradb, nbeal1, and RNAbinding motif protein 26 (rbm26). Trim25 is an E3 ubiquitin ligase involved in regulating the innate immune response against viruses (Martıń-Vicente et al., 2017). In aquatic animals, Trim25 has been reported to play an important role in antiviral innate immunity in common carp (Palaiokostas et al., 2018), Epinephelinae (Yang et al., 2016), and Danio rerio (Jin et al., 2019). Nbeal1 is a Golgi-associated protein required for the regulation of cholesterol metabolism (Bindesbøll et al., 2020).
In addition to the candidate genes mentioned above, there are some candidate genes that are involved in fat uptake, myogenesis, growth, and immunity. Parkin RBR E3 ubiquitin protein ligase (prkn) encodes the E3-ubiquitin ligase PARKIN, which regulates systemic fat uptake via ubiquitin ligasedependent effects (Kim et al., 2011) and changes ubiquitin labeling of cell cycle proteins (Wahabi et al., 2018). Adiponectin receptor protein 2-like (adipor2), an important adipocytokine, is associated with feeding, reproduction, immunity, and glycolipid metabolism (Kubota et al., 2007;Dobrzyn et al., 2018;Peng et al., 2018;Jian et al., 2019). Tolllike receptor 13 (tlr13) has been reported to recognize bacterial Manhattan plot for seven growth traits of rice flower carp. Dots with different colors denote loci on different chromosomes. The blue horizontal line represents the significance threshold (5.0), and the red line represents the extremely significance threshold (6.9). The red horizontal line represents the significance threshold (5.0), and the blue line represents the extremely significance threshold (6.9).
RNA, and Epinephelus coioides tlr13 is involved in recognizing the RNA of Vibrio parahaemolyticus and can induce the production of inflammatory cytokines (Zhou, 2017). Ubiquitin carboxyl-terminal hydrolase 49-like (usp49) negatively regulates cellular antiviral responses by deconjugating K63-linked ubiquitination of MITA (Ye et al., 2019). Sox-9-B-like (sox9b) was associated with skeletal development (Dalcq et al., 2012), pancreatic duct development (Manfroid et al., 2012), gonadal development (Meyer et al., 1997;Chiang et al., 2001;Chiang et al., 2001) and retinal development (Yokoi et al., 2009). Interestingly, many ubiquitin proteasome pathway-associated genes were detected, including ubiquitin carboxyl-terminal hydrolase FAF-X (usp9x), usp11, trim25l, ubiquitinconjugating enzyme E2 D4 (ube2d4), prkn, usp49, ubiquitin carboxyl-terminal hydrolase 12-like (usp12), E3 ubiquitinprotein ligase RNF19B-like (rnf19b), and ubiquitin, which is similar to the previous study on the muscle transcriptome of rice flower carp with different sizes . This suggests that  the ubiquitin-proteasome pathway is involved in the growth regulation of rice flower carp. In this study, the function of most candidate genes may play important roles in the growth of rice flower carp should be determined in the future studies.

Conclusion
Rice flower carp is an important commercial fish species used in integrated rice-fishery farming. Improving the growth ratio using growth-trait associated markers could increase supply to match increasing market demand. In this study, 411,913 loci were obtained from the reduced-representation genome of rice flower carp and used to conduct a genome-wide association analysis to identify growth-trait associated markers. A total of 90 loci were significantly associated with growth traits in the rice flower carp. A series of candidate genes involved in metabolism, immunity, osteogenic differentiation, and visual function were extracted near the significant loci, including ccdc92, nbeal1, prkn, adipor2, trim25, tlr13, usp49, otud5, elk1, and nfic. This study provides important insights into the genetic architecture of growth traits in this fish species, as well as facilitate marker-assisted selection for growth improvement in rice flower carp.

Data availability statement
The data presented in the study are deposited in the Genome Sequence Archive in National Genomics Data Center, China National Center for Bioinformation, Chinese Academy of Sciences repository, accession number PRJCA014203.

Ethics statement
All experiments in this study were conducted in strict accordance with the ethical guidelines and regulations of the Animal Care and Use Committee of Southwest University and were approved by the local ethics committee of the Animal Welfare and Ethical Committee of Southwest University.