Investigation of the Genetic Diversity and Quantitative Trait Loci Accounting for Important Agronomic and Seed Quality Traits in Brassica carinata

Brassica carinata (BBCC) is an allotetraploid in Brassicas with unique alleles for agronomic traits and has huge potential as source for biodiesel production. To investigate the genome-wide molecular diversity, population structure and linkage disequilibrium (LD) pattern in this species, we genotyped a panel of 81 accessions of B. carinata with genotyping by sequencing approach DArTseq, generating a total of 54,510 polymorphic markers. Two subpopulations were exhibited in the B. carinata accessions. The average distance of LD decay (r2 = 0.1) in B subgenome (0.25 Mb) was shorter than that of C subgenome (0.40 Mb). Genome-wide association analysis (GWAS) identified a total of seven markers significantly associated with five seed quality traits in two experiments. To further identify the quantitative trait loci (QTL) for important agronomic and seed quality traits, we phenotyped a doubled haploid (DH) mapping population derived from the “YW” cross between two parents (Y-BcDH64 and W-BcDH76) representing from the two subpopulations. The YW DH population and its parents were grown in three contrasting environments; spring (Hezheng and Xining, China), semi-winter (Wuhan, China), and spring (Wagga Wagga, Australia) across 5 years for QTL mapping. Genetic bases of phenotypic variation in seed yield and its seven related traits, and six seed quality traits were determined. A total of 282 consensus QTL accounting for these traits were identified including nine major QTL for flowering time, oleic acid, linolenic acid, pod number of main inflorescence, and seed weight. Of these, 109 and 134 QTL were specific to spring and semi-winter environment, respectively, while 39 consensus QTL were identified in both contrasting environments. Two QTL identified for linolenic acid (B3) and erucic acid (C7) were validated in the diverse lines used for GWAS. A total of 25 QTL accounting for flowering time, erucic acid, and oleic acid were aligned to the homologous QTL or candidate gene regions in the C genome of B. napus. These results would not only provide insights for genetic improvement of this species, but will also identify useful genetic variation hidden in the Cc subgenome of B. carinata to improve canola cultivars.


INTRODUCTION
Ethiopian mustard (Brassica carinata A. Braun 2n = 4x = 36, genomes BBCC) is an important member of the family Brassicaceae and is commercially grown for edible vegetable oil, and as vegetable crop in Ethiopia. This crop is also being exploited for biodiesel as a source of a renewable energy (Warwick et al., 2006). It consists of the two homologous genomes, B and C, and may have originated as an allotetraploid species as a result of spontaneous hybridization between diploid species; Brassica nigra (2n = 2x = 14, genome BB) and Brassica oleracea (2n = 2x = 18, genome CC) in Ethiopia (Nagaharu, 1935;Lukens et al., 2004;Warwick, 2011). B. carinata harbors useful genes for resistance to abiotic and biotic stresses (Getinet et al., 1996), and therefore has been used as a donor for introgression of genes to improve and widen the gene pool of Brassica rapa, Brassica napus, and Brassica juncea germplasm (Meng et al., 1998;Xiao et al., 2010;Wei et al., 2016).
Relative to its other close relatives, B. rapa, B. napus, and B. juncea, genetic improvement of this crop has been limited due to lower grain yield, "poor" canola oil quality attributes and unavailability of genetic tools and resources. So far, the genomes of B. rapa, B. oleracea, B. napus, B. nigra, and B. juncea have been sequenced and published (Wang et al., 2011;Chalhoub et al., 2014;Liu et al., 2014;Parkin et al., 2014;Yang et al., 2016), while the genome of B. carinata is sequenced but has not been published (Parkin, per comm.).
Research has shown that a limited genetic variation exists in B. carinata (Jiang et al., 2007;Guo et al., 2012). In order to increase the rate of genetic gains in breeding programs, QTL (quantitative trait loci) mapping has been successfully used to identify both qualitative and quantitative loci in several crops including vegetable and oilseed brassicas. QTL for various agronomic traits such as grain yield and its components, flowering time, seed quality, and tolerance to biotic and abiotic stresses have been identified in B. rapa (Rahman et al., 2014;Hirani et al., 2016), B. juncea (Singh et al., 2012;Bagheri et al., 2013;Dhaka et al., 2016) and B. napus (Snowdon and Friedt, 2004;Shi et al., 2009;Zou et al., 2010;Yong et al., 2015;Raman et al., 2016). However, reports on such analyses in B. carinata has been very limited. A genetic linkage map of the doubled haploid (DH) population (named as YW population) derived from a cross between two DH accessions of B. carinata, Y-BcDH64 (yellow petal) and W-BcDH76 (white petal), was constructed using a total of 212 amplified fragment length polymorphism (AFLP), intron based polymorphism (IBP), sequence-related amplified polymorphism (SRAP), and simple sequence repeats (SSR) markers, and covered 1,703 cM with a marker density of 8 cM between adjacent loci (Guo et al., 2012). In addition, a high density genetic linkage map of the YW DH population integrating existing 212 markers (Guo et al., 2012) and 3,819 presence-absence markers based on genotyping-by-sequencing, DArTseq markers (the traditional DArT and next-generation sequencing technique called DArTseq, Raman et al., 2014), was established with increasing genome-wide coverage (2,048 cM) (Zou et al., 2014). The genetic map of YW DH population was subsequently used for the QTL mapping of genetic loci involved in petal color, another tip color, seed coat color (Guo et al., 2012), and flowering and budding time (Zou et al., 2014). However, a limited phenotypic datasets especially on agronomic traits were available for QTL detection and routine marker-assisted selection.
In the present study, we genotyped a set of 81 B. carinata accessions to investigate the genetic diversity, population structure and the linkage disequilibrium (LD) pattern of B. carinata using DArTseq markers. We also conducted genomewide association study (GWAS) to detect associations between markers and six seed quality traits among the 81 B. carinata accessions evaluated in two experiments. We further phenotyped the YW DH population for 14 seed yield, yield-related traits, and oil quality traits across different agro-climatic conditions for QTL mapping, and compared those results with the putative candidate genes or QTL identified in the C genome of B. napus, as well as the associated markers accounting for genotypic variation in seed quality traits of the diverse lines of B. carinata.

Plant Materials
DH mapping population (185 lines) of B. carinata was developed from an F 1 cross between the DH parental lines, Y-BcDH64 and W-BcDH76 (Guo et al., 2012). In addition, a set of 81 diverse accessions (Supplementary Table 1; Jiang et al., 2007) of B. carinata including the parents of the YW DH population, mainly of Ethiopian origin, collected from Centre for Genetic Resources, Wageningen, The Netherlands and Germany, were used for genetic diversity analyses. All accessions were selfed to avoid any cross pollination.
DNA Extraction, Genotyping, and the Physical Position of the Markers DNA from the B. carinata accessions was extracted from bulked young leaf tissue using the DNA extraction Kit (DNeasy Plant Mini Kit, QIAGEN) and sent to DArT P/L (www.diversityarrays.com) for DArTseq based genotyping as described previously . Both presence-absence markers and SNP markers, collectively called "DArTseq markers, " were scored by DArT P/L.
The 69-bp long sequence of the filtered DArTseq markers identified in 81 B. carinata accessions were aligned to the reference genome of B. nigra , B. napus (Chalhoub et al., 2014), and B. oleracea  by Blast using an e-value threshold of e −10 (Altschul et al., 1990) with ≥60 bp match length. The top blast hits for the sequences were assigned as the physical position of the corresponding markers. The blast matches to multiple loci with the same top e-value were considered with multiple positions, and assigned to the class of unassigned markers without unique positions to the reference genome.
In order to compare the QTL detected in the C genome of YW DH population with the QTL or candidate gene identified in B. napus, the physical position of the markers mapping to the C genome of YW DH population were assigned according to the top blast hits for the sequences of each marker against the reference C genome of B. napus (Chalhoub et al., 2014) by Blast using an e-value threshold of e −10 (Altschul et al., 1990) with ≥62 bp match length. If the blast matches to multiple loci had the same top e-value, the locus which showed the same chromosomal localization of the marker as in the genetic and physical map, was considered as the right location of the marker.
Population Structure, Genetic Relatedness, and Linkage Disequilibrium Analysis The population structure was estimated with software Structure 2.3.4 using Bayesian clustering and the admixture model (Pritchard et al., 2000). The number of subgroups (K) was set from 1 to 10 with three independent runs. The optimum number of subpopulations was determined by log likelihood of the data [LnP(D)] and ∆K method described by Evanno et al. (2005). The Q matrix was assigned to a sub-population (Remington et al., 2001). Nei's genetic distance (Nei et al., 1983) was calculated and used for the unrooted phylogeny reconstruction using a neighbor-joining (NJ) method as implemented in Powermaker version 3.25 (Liu and Muse, 2005). Phylogenetic tree was viewed using MEGA 4.0 (Tamura et al., 2007). The markers with unique position were used to estimate the LD of B. carinata. The parameter r 2 (the squared Pearson correlation coefficient) between all pairs of SNP markers was used to estimate LD by the software package TASSEL 5.0 (Bradbury et al., 2007).

Genome-Wide Association Analysis
The genome-wide association analysis with four statistical models was performed by TASSEL version 5.0 (Bradbury et al., 2007). The four models are as follows: the general linear model (GLM) included a naïve model without controlling for population structure; the Q model which controlled for population structure; the mixed linear model (MLM) including kinship and the Q+K model which controlled for both population structure and kinship. The threshold for the significance of associations between markers and traits, P < 8.32 × 10 −5 (P = 1/total markers used; −log 10 (1/12,030) = 4.08) as in  was used in this study.

Field Experiments and Phenotypic Measurements
The field experiments with 185 DH lines of the YW population and two parental lines were conducted in randomized complete block designs, with three replicates except for 2010 in Wuhan where only two replicates were followed. The trials were conducted in three contrasting environments: (i) spring [two experiments, one each at Hezheng (Gansu province, China) and Xining (Qinghai Province, China)], (ii) semi-winter at Wuhan, representing the major rapeseed growing zone in China, and (iii)  For Chinese environments, each plot was 3.0 m 2 and 30 plants were planted in three rows with three replicates, with a distance of 40 cm between rows and 25 cm between individual plants. For Australian environment, trials were sown in pots under the birdcage at Wagga Wagga Agricultural Institute, NSW, Australia in 2013 and 2014. The trials consisted of two replications of four rows and 100 column pot array with each replication consisting of two rows and 100 column pot array. Five plants were sown per pot.
Seed yield, seven yield related traits (flowering time, pod width, pod length, seed number per pod, seed weight, pod number on main inflorescence, and length of main inflorescence) and six seed quality traits (protein content, oil content, erucic acid, linolenic acid, linoleic acid, and oleic acid) were investigated for QTL mapping in this study. The six seed quality traits (protein content, oil content, erucic acid, linolenic acid, linoleic acid, and oleic acid) were also measured on 81 diverse accessions of B. carinata that were used for GWAS. The seed yield and related traits were measured as described by Shi et al. (2009). Genetic variation in seed and oil quality traits was determined by Near-Infrared Reflectance Spectroscopy (NIR) method (Gan et al., 2003). Compared to other traits investigated in this study, only flowering time was determined in three different contrasting environments. We reanalyzed QTL for flowering time that were determined in a previous study (Zou et al., 2014), with the data collected from Australian Spring environment (WW13 and WW14).

Statistical Analyses
The broad sense heritability was calculated as h 2 = σ g 2 /(σ g 2 + σ ge 2 /n + σ e 2 /nr), where σ g 2 is the genetic variance, σ ge 2 is the variance representing genotype by environment interactions and σ e 2 is the error variance, n is the number of environments and r is the number of replications. Analysis of variance (ANOVA) was performed using σ g 2 , σ ge 2 , and σ e 2 estimated by using a Proc general linear model (GLM) in SAS software (SAS Institute Inc., 2000). Pearson correlations were calculated between phenotypic traits of interest (Weaver and Wuensch, 2013).

QTL Identification
A dense genetic linkage map of the YW DH population based on 4,031 presence-absence DArTseq markers corresponding to 1,366 unique loci (Zou et al., 2014) was utilized in this study. A total of 772 single nucleotide polymorphism (SNP) markers generated from the DArTseq were added to the genetic map by integrating in genetic bins (Supplementary Table 3). The present integrated map covered a genetic distance of 2,084 cM, with an average distance of 1.53 cM between loci and was subsequently used for QTL analysis. QTL associated with various traits of interest were identified by the composite interval mapping model (Wang et al., 2012) using the software WinQTL cartographer version 2.5 (http://statgen.ncsu.edu/qtlcart/WQTLCart.htm). The criteria for identification of QTL were followed as described by Shi et al. (2009). The significant QTL (P = 0.05) with overlapping suggestive QTL (P = 0.5) named as "identified" QTL. The term "consensus" QTL was used for the same genomic interval identified for the same trait detected in different environments and was integrated by meta-analysis using the software BioMercator v4.2 (Goffinet and Gerber, 2000). We designated the "major" QTL which had R 2 ≥ 20% and minor QTL which had R 2 < 19.9%, as suggested by Collard et al. (2005). QTL that could be detected in more than one experiment were classified as "reproducible" QTL, while others were classified as "non-reproducible."

The Alignment of Identified QTL with Previously Published Studies
We chose the reference genome of B. napus "Darmor-bzh" (Chalhoub et al., 2014), and the Tapidor/Ningyou (TN, B. napus) DH population as reference to compare QTL with those identified in YW DH population in this study. TN DH population has been extensively investigated previously and QTL associated with seed yield and its related traits and seed quality traits have been identified across different environments using genetic maps based on traditional makers ) and a newly constructed SNP-based genetic bin map (Luo et al., under review). DArTseq markers underlying the identified QTL regions that were identified on the C genomes of YW DH population and TN DH population were aligned to the reference genome of B. napus "Darmor-bzh" (Chalhoub et al., 2014) by Blast using an e-value threshold of e −10 (Altschul et al., 1990) with ≥62 bp match length. The details of gene annotation for "Darmor-bzh" were cited from Körber et al. (2016). When the QTL regions of YW DH population could be aligned to the candidate gene for the same trait in the C genome of B. napus, we presumed there is a homologous candidate gene in the QTL region of B. carinata. When the QTL regions of YW DH population and TN DH population showed alignment to the same physical position on the reference genomes, we assumed the homologous QTL control genetic variation for trait of interest in YW and TN populations.

Population Structure, Genetic Relatedness, and LD Analyses of B. carinata
A total of 54,510 polymorphic DArT-seq markers (38,396 presence-absence, also called silico DArTs and 16,114 SNPs) were identified in 81 accessions of B. carinata. After filtering, we selected 33,924 high quality markers that had overall call rates >90% and reproducibility >0.9, missing data of <10%, and MAF (minor allele frequency) >0.05 in the population, for the analyses of population structure and genetic diversity. The LnP(D) value decreased continuously with the change of K from 1 to 10, and the most significant change was observed when K increased from 1 to 2, ∆K also showed a peak at K = 2 (Figure 1,  Supplementary Table 4). Accordingly, the 81 accessions could be divided into two major sub-populations (Figure 2A). Both parental lines of the YW DH population were found to represent different lineage as revealed by both population structure and NJ phylogenetic tree (Figure 2).
When we aligned the marker sequences (33,924) of the diverse B. carinata lines to the published C n genome sequence of B. napus (Chalhoub et al., 2014) and the C • genome sequence of B. oleracea , respectively, much more B. carinata markers sequence (1,458) could be uniquely aligned to B. oleracea (5,126) than that to B. napus (3,668) under the same parameters. Besides, the quality of the alignment to B. oleracea were much better, 87.87% of the alignments to the B. oleracea with over 65 bp match length, which is much higher than aligned to B. napus (0.65%). To avoid the potential subgenomic differentiation of the C genome between B. carinata and B. napus, we used the aligned results against its progenitor B. oleracea, as C genome reference. As a result, a total of 10,999 markers that uniquely aligned to the genomes of B. nigra (B genome) and B. oleracea (C genome) were then used to estimate the LD pattern of the diverse lines. Our result indicated that the C genome showed a longer LD decay (400 Kb) than the B genome (250 Kb) (Figure 3, Table 1). The LD decay distance varied across 17 chromosome,  from 150 Kb (B4) to 650 Kb (C4, C6, and C9; Table 1). These results revealed significant differences in the level of LD between different chromosomes and subgenomes.

Phenotype Variation within the YW DH Population across Environments
Estimated means of parental lines, ranges and h 2 -values of traits measured in DH lines are given in Table 3. Both parental lines Y-BcDH64 and W-BcDH76 differed significantly for most traits measured across phenotypic environments ( Table 2). W-BcDH76 had higher seed yield and 1,000 seed weight and early flowering time than Y-BcDH64. Estimated means for different seed quality traits in Y-BcDH64 were higher than W-BcDH76 in most environments except for oil content and oleic acid. Erucic acid was significantly different in the contrasting environments. Y-BcDH64 had higher erucic acid in semi-winter environment, but had a lower content in spring environment ( Table 2).
A wide range of phenotypic variation was observed for all 14 traits between the parents and among the DH lines under different environments. Moderate to high h 2 was observed in the YW population for 10 traits measured in more than one experiment, ranging from 15.59% for pod width (spring) to 90.79% for linolenic acid (spring) ( Table 3). Frequency distributions for trait means clearly showed transgressive segregation for all the traits in the DH lines (Figure 4,  Supplementary Figure 2). ANOVA on the investigated traits across different environments showed that "genotype (G), " "environment (E), " and "G × E" have significant effects for almost all of the traits except for the flowering time (at Wagga Wagga, Australia; Table 3). However, significant positive correlations for the flowering time were observed in different experiments, WH09 and WH10, HZ12, and XN12, WW13, and WW14 ( Figure 5A). The correlation coefficient for the flowering time observed between semi-winter (China) and spring (Australia) experiments was higher than that between semi-winter  (China) and spring (China), and spring (China) and spring (Australia). Phenotypic correlations among seed quality traits were also calculated ( Figure 5B). The correlation between oil content and oleic acid, and oil content and linoleic acid showed the significant negative correlation, but a significant positive correlation was observed between oil content and erucic acid, as well as oil content and linolenic acid. Oleic acid showed positive correlation with linoleic acid, but significant negative correlation with linolenic acid and erucic acid. These results showed that the different fatty acid components in the seeds correlate with high erucic acid.

Identification of QTL for Seed Yield and Its Related Traits
A total of 150 identified QTL for seed yield, flowering time, pod width, pod length, seed number per pod, seed weight, pod number of main inflorescence, and length of main inflorescence were identified in the YW DH population (Supplementary Table 5). After binning QTL which were repeatedly detected from different experiments for the same traits, a total of 116 consensus QTL were identified for the eight agronomic traits. Of these, 76 consensus QTL were detected in B genome while 40 were detected in C genome. The most number of consensus QTL were identified in chromosomes B1 and B8 (13 each), while the least number of consensus QTL (1) were identified in C1 (Figure 6, Supplementary Figure 3).
Six consensus QTL for seed yield were identified (only measured in two experiments, HZ12 and XN12, because of the high pod shattering) on chromosomes B2, B4, B6, C4, and C9, explaining phenotypic variation ranging from 4.2 to 16.7%. The parent W-BcDH76 contributed favorable alleles of the QTL on B2, B4, C4, and C9, whereas parent Y-BcDH64 contributed favorable alleles of two consensus QTL for seed yield (qSY.B6-1, qSY.B6-2) located on B6 chromosome (Supplementary Table 5). None of the consensus QTL was consistently detected across experiments. But four of the six consensus QTL overlapped with the QTL accounting for other yield related traits. For instance, qSY.B4-1 and qSY.C9-1 for seed yield were overlapped with the QTL for seed number per pod (qSN.B4-1, qSN.C9-1), and qSY.B6-1 was overlapped with the QTL for seed weight (qSW.B6-1) (Supplementary Figure 3).  For flowering time, a total of 28 consensus QTL were identified on 11 chromosomes in three macro-environments (Australian spring, Chinese spring, and Chinese semi-winter; Figure 7). The identified QTL (qFT35, C6) integrated in the consensus QTL could explain up to 59.40% of the phenotypic variation in flowering time (Supplementary Table 5). A total of three consensus QTL, all of which located in C6, were identified as "major QTL" individually explaining phenotypic variation over 20% (Table 4). For instance, the consensus QTL, qFT.C6-3 on chromosome C6 within 1 cM interval (genetic position from 35.5 to 36.5 cM) explained the maximum phenotypic variation (from 28.0 to 59.4%). Of the three major QTL, qFT.C6-1 was identified in the semi-winter environment of China with high R 2 (27.13% in experiment WH09 and 42.36% in experiment WH10), but low R 2 (2.35% in experiment XN12) identified in the spring environment of China. Among the 28 total QTL for flowering time, seven of them (qFT.B4-1, qFT.B4-2, qFT.B4-3, qFT.C6-1, qFT.C6-3, qFT.C8-1, and qFT.C8-2) were identified in at least two macro-environments (Supplementary Table 5). Two of the seven QTL (qFT.B4-1 and qFT.B4-2) were identified in all the three environments, and both were located in B4 chromosome. Four of the seven consensus QTL, qFT.B4-3, qFT.C8-2, and qFT.C6-1, qFT.C8-1 were identified in the two spring macroenvironments (Australian spring and Chinese spring) and two Chinese macro-environments (Chinese semi-winter and Chinese spring), respectively. While, the last one of the seven consensus QTL, qFT.C6-3 was identified in both of the Chinese semi-winter environment and Australian spring environment. These results indicated the differential genetic effects of the QTL in response to environment.

Identification of QTL for Seed Quality Traits
A total of 296 identified QTL were detected for protein content, oil content, erucic acid, linolenic acid, linoleic acid and oleic acid in the seeds of the YW DH population (Supplementary Table 5). These identified QTL could be integrated as 166 consensus QTL and localized across the 17 chromosomes, explaining phenotypic variation varying from 0.2 to 50.4%. Four of these QTL, located in C4, explained ≥20% of the phenotypic variation and therefore were designated as major QTL (Table 4). Consistent with the distribution of QTL for seed yield and yield related traits in subgenomes, more QTL were identified in the B genome (92) compared with that in the C genome (74) for seed quality traits (Figure 6). In chromosome B8, the most number of consensus QTL (24) were detected for seed quality traits, whereas the least (1) were identified in chromosome C7.
Among the six seed quality traits, the maximum number of consensus QTL were identified for oil content (34) and the minimum were identified for protein content (20). The QTL for oil content were detected on all chromosomes of B. carinata genome except B9 and C7 (Supplementary Table 5). The R 2 explained by these QTL ranged from 2.67% (qOC.C5-1 on C5) to 10.4-23.4% (qOC.C4-1 on C4). Nine consensus QTL for oil content were repeatedly detected in at least two experiments. The alleles from the parent W-BcDH76, at the QTL located in chromosome B5, C3, C4, C5, C6, C8, and C9, contributed to the increase of oil content in YWDH population.
Using the phenotypic data of six seed quality traits investigated for the 81 diverse lines in two experiments, we also identified the markers accounting for genetic variation in seed quality traits by GWAS and compared with the results revealed by linkage analyses. A total of 12,030 DArT-seq markers with known physical position were used for GWAS. According to the QQ plots, the suitable model for each trait was selected. Seven markers were found to be significantly associated with five traits ( Table 5). Linoleic acid showed a very limited phenotypic variation among diverse lines. As a result, we could not detect any association for this trait. One of the seven markers (5121285) was significantly associated for three traits (PRO-WH13, OC-WH13, and ERU-WH13). Two markers 5868483-1S and 5121285 showed significant association with linolenic acid (LEN-WH14) and erucic acid (ERU-WH13), respectively. Both these markers also located in the QTL region for the same traits in the YW DH population (  Whereas, the other QTL for seed weight (qSW.B5-2) was overlapped with the one for pod width (qPW.B5-1) in B5 chromosome. In C4 chromosome, there were five overlapping QTL, qPRO.C4-1, qLEI.C4-2, qOLE.C4-2, qLEN.C4-1, and qOC.C4-1 for protein content, linoleic acid, oleic acid, linolenic acid, and oil content, respectively (Supplementary Figure 3, Supplementary Table 5).
To determine whether the homologous QTL genomic regions of B. carinata controlling variation for yield related traits and seed quality traits present in the C genome could also modulate genetic variation for these traits in the other oilseed Brassica crop containing C genome, we compared and aligned the QTL regions of YW DH population to the physical position of those QTL regions of TN DH population and candidate genes predicted in B. napus (the reference genome Darmor-bzh). The QTL accounting for flowering time, erucic acid and oleic acid of B. carinata could be aligned to the candidate genes of the B. napus for the three traits (Figure 8).  Figure 8B). The QTL accounting for oleic acid in C3 (qOLE25) and C8 (qOLE47, qOLE48, and qOLE49) of the YW DH population were aligned to the gene FAR7 gene in C3 and C8 of oleic acid in B. napus (Körber et al., 2016), respectively.
Additionally, a total of 17 QTL detected in the C genome of the YW DH population could also be aligned to the homologous QTL region for the same traits in the C genome of B. napus (Supplementary Table 5). For example, the identified QTL (qFT34, qFT35, and qFT36) for flowering time was aligned to the homologous QTL region in the TN DH population (Luo et al., under review). Three QTL (qSW26, qSW27, and qSW31) for seed weight and one QTL (qPRO28) for protein content were also aligned to the homologous QTL region of TN DH population. For the erucic acid, ten of the identified QTL (qERU37, qERU49 to qERU56, qERU59) detected in C3, C6, and C8 chromosomes could be aligned to the homologous QTL region in the C genome of B. napus.

DISCUSSION
Development of high yielding varieties of B. carinata with canola quality oil/biodiesel properties is one of the important goals of some Brassica improvement programs. To utilize genetic diversity present in B. carinata accessions originated from Ethiopia and improve this species itself, it was important to evaluate the extent of genetic diversity and to identify the QTL associated with the important agronomic and seed quality traits.
Assessment of genetic diversity in any species is crucial for genetic improvement programs. This has been accomplished using morphological, biochemical (isozymes) and molecular markers in various crops. In B. carinata, different marker systems such as AFLP, and SSR markers have been utilized to analyse genetic diversity (Warwick et al., 2006;Jiang et al., 2007). In this study, we utilized high quality 33,924 DArTseq markers to analyse the genetic diversity. Our results on population structure revealed that 81 accessions could be grouped into two sub-populations (Figure 1). These results suggested that a limited genetic diversity exist in B. carinata accessions originated from different regions of Ethiopia-the center of genetic diversity of this species.
We found that the LD decay in B. carinata varied across chromosomes, consistent with previous findings in B. napus (Qian et al., 2014;Liu et al., 2016) and B. oleracea (Pelc et al., 2015). The LD in the C subgenome (0.17-0.65 Mb) decayed longer than that of the B subgenome (0.15-0.35 Mb) of B. carinata. A previous study in B. napus has shown that the LD decay of the C subgenome was also longer than that of the A subgenome . Long LD decay in the C subgenome of B. napus and B. carinata may be due to the high level of gene conservation which could have resulted from limited recombination or due to large segmental structural variation. However, the distance of LD decay (1.12-8.50 Mb) in the C subgenome of B. napus presented previously (Qian et al., 2014;Liu et al., 2016) seems to be much longer than that of the C subgenome of B. carinata (0.17-0.65 Mb) investigated in this study, and the size of the block with strong linkage disequilibrium in C genome also seems much longer in B. napus. But the distance of LD decay (0.6 Mb) in C subgenome of B. oleracea (Pelc et al., 2015) seems to be similar to B. carinata. Additionally, when we aligned the marker sequences (33,924) of the B. carinata diverse lines to the published C genome sequence of B. napus (Chalhoub et al., 2014) and the C genome sequence of B. oleracea , significantly more B. carinata markers sequence could be uniquely aligned to B. oleracea (5,126) than that to B. napus (3,668) under the same parameters. These results may indicate that the C subgenome of B. carinata has substantially differentiated from the C subgenome of B. napus, and thus has great potential to broaden the genetic diversity of the C subgenome of B. napus.
A wide range of phenotypic variation was observed for all 14 traits among the DH lines under different environments and showed significant environment interactions (Figure 4,  Supplementary Figure 2). The flowering time was shorter in spring environment than that in semi-winter environment as expected. The seed weight of lines grown in spring environment was heavier than that presented in semi-winter environment. The seed quality traits were also significantly affected by environments. For example, a higher oil content and erucic acid were observed in spring environment than that in semiwinter environment. This may have resulted from the springtype habit of B. carinata which is more adapted under spring type environment. Due to the late maturity in semi-winter FIGURE 6 | The number of the consensus QTL across the 17 linkage groups of YW DH population. The boxes with diagonal lines and cross lines represent the QTL number for seed yield with its related traits and seed quality traits, respectively. environment, the high temperature may adversely affect seed yield and quality attributes. Therefore, shortening the flowering time of B. carinata to adapt to the semi-winter environment would increase the seed yield and improve the seed quality of this species. In the YW DH population (high erucic acid), the correlations among the oil content and its related fatty acid content was different with that observed in the TN DH population of B. napus (low erucic acid). A previous study showed that the oil content show no correlation with erucic acid and oleic acid (Smooker et al., 2011), while another research found that oil content showed negative correlation with erucic acid but positive correlation with oleic acid in B. napus (Zhao et al., 2008). This resulted indicated that the content of different fatty acid components correlated significantly but varied in different genetic background, especially showing different correlations when the lines contains low erucic acid and high erucic acid.
One of the major objectives of this study was to detect QTL for important agronomic and seed quality traits of B. carinata. The QTL we identified here would provide insights for understanding the genetic basis of these traits and further genomic-based genetic improvement of this species, which would be also reference for the comparison with other Brassica species. A total of 282 consensus QTL with an average confidence intervals of 6.0 cM, were detected for the investigated traits of the YW DH population, including nine consensus QTL stably detected with major genetic effects for multiple traits (Supplementary Table 5). These major QTL detected stably in different environments may play a key role for the traits being adapted to different environments (Collard et al., 2005). Some QTL identified in the YW DH population were validated in the diverse lines (Table 5).
Additionally, new markers associated with the seed quality traits different with that in the YW DH population were also identified. In the YW DH population, the consensus QTL detected in B genome were much more than that in the C genome (Figure 6). This suggested that the allelic variation accounting for the traits in the B subgenome was more than that of the C subgenome between the two parents of the YW DH population. On the other hand, this may also be related to the richer genetic diversity or historical recombination events accumulated in the B genome than that in the C genome of this species, which is indicated by the longer LD decay in the C subgenome (Figure 3).
By physical alignment and comparison of the QTL identified in the YW DH population and diverse lines, we could find homologous QTL accounting for flowering time, erucic acid, and oleic acid traits to the candidate genes and QTL of B. napus in the C genome (Figure 8). However, these QTL may have undergone functional differentiation with different genetic effects to the traits and environments between the two species due to different adaptation, domestication and cultivation. For instance, the major QTL for erucic acid in the YW DH population was located in chromosome C4 (identified QTL qERU39, R 2 = 24.53% and qERU44, R 2 = 32.60%, integrated in the consensus qERU.C4-1), but the QTL aligned to the major gene in C3 of B. napus only presented minor effects in the YW DH population. According to the canola breeding history, the low erucic acid mainly resulted from the mutation of the genes located in A8 and C3 (Harvey and Downey, 1963;Harper et al., 2012). However, in B. carinata without the breeding process for low erucic acid such as in the YW DH population, the major QTL was different with that in canola B. napus. Therefore, introducing the segment with low erucic acid alleles from the C genome of B. napus would    (Yang et al., 2016) and B. napus (Darmor-bzh), respectively.
Frontiers in Plant Science | www.frontiersin.org be useful for improvement on the seed quality of B. carinata.
For flowering time, one of the major QTL aligned to the FT gene in C6 of B. napus could be detected in both of the semiwinter and spring environments in the YW DH population, while this gene may mostly express in spring environment in B. napus (Wang et al., 2009). In the near future, with accumulated QTL information from different populations of B. carianta and the available genome sequence of this species, we may understand more deeply on the allelic variation within the species and subgenomic variation between the species.

AUTHOR CONTRIBUTIONS
WZ performed the research, analyzed the data and wrote the manuscript; DH and XS analyzed the data; SG provided the phenotypes in the semi-winter environments; HR and RR performed the experiment in Australia, and analyzed the data; ZW helped the field experiments and phenotyping; JM provided the plant materials and suggestions; JZ designed the research and analyzed the data. All authors revised, read, and approved the final manuscript.

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2017. 00615/full#supplementary-material the average temperature each month during the growing season in different environments; (B) represents the rainfall each month during the growing season.
Supplementary Figure 2 | The frequency distribution of the phenotypic variation for seven traits of the YW DH population. Four seed yield related traits and three seed quality traits are showed in this figure, while the other seven investigated traits are shown in Figure 4. P1 and P2 represents the two parents Y-BcDH64 and W-BcDH76, respectively.