A High-Density Genetic Linkage Map for Cucumber (Cucumis sativus L.): Based on Specific Length Amplified Fragment (SLAF) Sequencing and QTL Analysis of Fruit Traits in Cucumber

High-density genetic linkage maps play an important role in genome assembly and in the fine mapping of quantitative trait loci (QTL). Next-generation sequencing has simplified SNP discovery and high-throughput genotyping, which makes the construction of high-density linkage maps much more convenient and practical. In this research, a high-density linkage map of cucumber was constructed by specific length amplified fragment (SLAF) sequencing of 153 F2 individuals from the cross S1000 × S1002. The map comprised 3,057 SLAFs, including 4,475 SNP markers, on seven chromosomes and spanned 1,061.19 cM, with an average genetic distance of 0.35 cM between adjacent markers. Based on this map, QTL analysis was performed for two cucumber fruit traits, fruit length and fruit diameter, and 15 QTLs were detected for the two traits.


INTRODUCTION
Cucumber (Cucumis sativus L.) is a diploid species (2n = 14) that belongs to the family Cucurbitaceae. Its cultivated area is second only to that of tomato worldwide. Cucumber has seven chromosomes and hundreds of known functional genes (Xie and Wehner, 2001), and its genome size is only 367 Mb, which is smaller than that of other commercial cucurbit crops (Ren et al., 2009). Because of its economic importance and nutritional value, cucumber has become the model plant for genetic research in the Cucurbitaceae, and the study of its genetics and molecular biology has progressed rapidly over the last 20 years. In recent years, the whole genomes of the cucumber lines '9930' ('Chinese long' inbred line; Huang S. et al., 2009), 'Gy14' (North American pickling type; Cavagnaro et al., 2010), and 'B10' (European inbred line; Wóycicki et al., 2011) have been sequenced.
Currently, genetic linkage maps commonly include restriction fragment length polymorphism (RFLP), random amplified polymorphic DNA (RAPD), amplified fragment length polymorphism (AFLP), sequence characterized amplified region (SCAR), simple sequence repeat (SSR), and single-nucleotide polymorphism (SNP) markers. A genetic map, especially a high-density genetic map, provides an important foundation for quantitative trait loci (QTL) mapping (Esteras et al., 2012; Petroli et al., 2012; Song et al., 2012) and for anchoring sequence scaffolds (Wang et al., 2011; Ren et al., 2012), and the utility of a genetic linkage map depends on the types and numbers of markers used. Since 1994, several kinds of markers, including RAPD, AFLP, SCAR, SRAP, and SSR, have been used to assess the genetic diversity of cucumber accessions (Dijkhuizen et al., 1994; Kennard and Havey, 1995; Serquen et al., 1997; Park et al., 2000; Bradeen et al., 2001; Fazio et al., 2003; Li et al., 2005; Wang et al., 2005; Sun et al., 2006; Yuan et al., 2008a,b; Ren et al., 2009; Zhang et al., 2012), but no SNP markers were available for cucumber. The rapid development of next-generation sequencing (NGS) has made it possible to obtain thousands of SNPs throughout the cucumber genome and to construct high-density SNP genetic maps. Several methods have been used to discover SNP markers, for example RAD-seq (restriction site-associated DNA sequencing; Miller et al., 2007), double digest RAD-seq (Peterson et al., 2012), and GBS (two-enzyme genotyping-by-sequencing; Poland et al., 2012). Sun et al. (2013) developed a new technique, SLAF-seq (specific length amplified fragment sequencing), for large-scale de novo SNP discovery and genotyping. The efficiency of this method was tested on rice and soybean; in addition, SLAF-seq was used to create a high-density genetic map for common carp (Cyprinus carpio L.).
Through SLAF-seq, SNP markers have been widely applied to high-density genetic map construction in several crop and animal species, such as sesame, soybean (Qi et al., 2014), and common carp. It is therefore possible to construct a high-density genetic map of cucumber based on SNP markers, which may provide a more saturated genetic linkage map. Wei et al. (2014) constructed the first SNP genetic map of cucumber by SLAF-seq; it contained 1,800 SNPs, with an average distance of 0.50 cM between adjacent markers. Xu et al. (2015b) constructed a SNP map of cucumber by SLAF-seq that contained 1,892 SLAFs, with a total length of 845.87 cM and an average distance of 0.45 cM. Subsequently, Xu et al. (2015a) constructed another SNP map using 949 F2 individuals and, on this basis, performed QTL analysis of fruit flesh thickness and identified candidate genes.
Many important QTLs and horticultural trait associations have been studied in cucumber, including sex expression, lateral branching, disease resistance, parthenocarpy, fruit shape, formation of bisexual flowers, warty fruit, and fruit flesh thickness (Kennard and Havey, 1995; Serquen et al., 1997; Dijkhuizen and Staub, 2002; Fazio et al., 2003; Pan et al., 2005; Sakata et al., 2005; Sun et al., 2006; Li et al., 2009; Yang et al., 2014; Xu et al., 2015a). Some of these results have been successfully used in marker-assisted selection breeding of cucumber.
In our study, genotype data were generated and SNP markers were discovered for cucumber by SLAF-seq. Using these SNP markers, we constructed a high-density genetic map of cucumber. This map may be useful for locating QTLs associated with economically important traits of cucumber, and we used it to define QTLs for two fruit traits, fruit length and fruit diameter.

MATERIALS AND METHODS
Ethical Standards
The authors declare that this study complies with the current laws of the countries in which the experiments were performed.

Plant Materials
The gynoecious cucumber line S1000 was crossed with the monoecious line S1002. S1000 plants have small leaves and normally produce short fruit (10-15 cm) with a diameter of about 55-60 mm. In contrast, S1002 plants have large leaves and longer fruit (40-45 cm) with a diameter of about 30-35 mm. The parental materials were provided by Sanwen Huang. The two parental lines S1000 and S1002 were planted in a greenhouse in the spring of 2012, the F1 of S1000 × S1002 was grown in August 2012, and the F2 population was planted in the spring of 2013 and self-pollinated to obtain 150 F2:3 families.

DNA Extraction
Young healthy leaves from the two parents and the 153 F2 plants were collected, and DNA was extracted by the CTAB method (Doyle and Doyle, 1987). DNA was quantified with an ND-2000 spectrophotometer (NanoDrop, Wilmington, DE, USA) and by electrophoresis in 0.8% agarose gels with a lambda DNA standard.

Phenotypic Data Collection and Genotyping
Phenotypic data were collected in three environments over 3 years (2013 autumn, 2014 spring, and 2015 spring). The F3 families, S1000, S1002, and their F1 were included in all experiments. In each season, the F3 families were arranged in a randomized complete block design with two replications and 10 plants per replication. Two traits were recorded: fruit length (fl, cm; measured from the apex of the fruit to the pedicel attachment) and fruit diameter (fd, mm; measured at the maximum fruit width). The fruit traits were measured according to the standards published by Yuan et al. (2008a). fl and fd were measured on individual plants and averaged within each F3 family. To allow for losses to disease and other factors, 12 plants were randomly selected from each F3 family and planted, and three fruits were measured per plant. Phenotypic data from 10 healthy plants were used to represent each F2 individual.

SLAF Library Construction and High-throughput Sequencing
SLAF-seq was used to genotype the 153 F2 plants and the two parents, as described previously. Genomic DNA was digested to completion with RsaI + Hpy166II. A single-nucleotide A overhang was added to the digested fragments with Klenow Fragment (3′→5′ exo−; NEB) and dATP at 37°C, and Duplex Tag-labeled sequencing adapters (PAGE purified, Life Technologies) were then ligated to the A-tailed DNA with T4 DNA ligase. PCR was performed using diluted restriction-ligation samples, dNTP, Q5® High-Fidelity DNA Polymerase, and the PCR primers AATGATACGGCGACCACCGA and CAAGCAGAAGACGGCATACG (PAGE purified, Life Technologies). The PCR products were purified using Agencourt AMPure XP beads (Beckman Coulter, High Wycombe, UK) and pooled. The pooled sample was separated by electrophoresis in a 2% agarose gel. Fragments of 244-344 bp (with indexes and adaptors) were excised and purified using a QIAquick Gel Extraction Kit (QIAGEN). The gel-purified product was sequenced on the Illumina HiSeq 2500 system (Illumina, Inc., San Diego, CA, USA) according to the manufacturer's recommendations.

SLAF-seq Data Analysis and Genotyping
SLAF-seq data were processed with the software developed by Sun et al. (2013), and genotyping followed the methods of Sun et al. (2013) and Wei et al. (2014). The paired-end reads generated by SLAF-seq were clustered according to sequence similarity, which was determined by one-to-one alignment with BLAT (-tileSize=10 -stepSize=5). Identical reads were merged, and reads with over 90% sequence similarity were grouped into one SLAF locus. In each SLAF locus, alleles were defined by evaluating the minor allele frequency (MAF).
To ensure the quality of the genetic map, SLAFs were filtered out if they met any of the following criteria: (1) a parental sequence depth of less than 10×; (2) a completeness (integrity) below 30%; (3) significant segregation distortion (p-value < 0.05); or (4) heterozygosity in the two parents. Because cucumber is diploid, one SLAF locus can contain no more than four SLAF tags in a mapping population; groups containing more than four tags were therefore considered repetitive SLAFs and excluded. Polymorphic SLAFs that contained 2-4 tags were considered potential markers. These polymorphic SLAF markers were then sorted into eight segregation patterns: ab × cd, ef × eg, hk × hk, lm × ll, nn × np, aa × bb, ab × cc, and cc × ab. Because the F2 mapping population was derived from two homozygous cucumber inbred lines with genotypes aa and bb, only SLAF markers with the aa × bb segregation pattern were used in map construction.
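The filtering rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the `Slaf` record and its field names are hypothetical, integrity is taken to mean the fraction of F2 individuals with a genotype call, and segregation distortion is tested with a chi-square test against the 1:2:1 ratio expected in an F2 (2 degrees of freedom, for which the p-value has the closed form exp(-x/2)).

```python
import math
from typing import NamedTuple, Tuple


class Slaf(NamedTuple):
    """Hypothetical per-locus summary; field names are illustrative only."""
    name: str
    pattern: str                     # segregation pattern, e.g. "aa x bb"
    parent_depths: Tuple[int, int]   # sequencing depth in each parent
    n_tags: int                      # number of SLAF tags at the locus
    counts: Tuple[int, int, int]     # F2 genotype counts (aa, ab, bb)
    n_offspring: int                 # size of the mapping population


def distortion_p_value(counts):
    """Chi-square test of F2 genotype counts against a 1:2:1 ratio."""
    total = sum(counts)
    if total == 0:
        return 0.0
    expected = (total / 4, total / 2, total / 4)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(counts, expected))
    # With 2 degrees of freedom the chi-square survival function is exp(-x/2).
    return math.exp(-chi2 / 2)


def keep_marker(s, min_depth=10, min_integrity=0.30, alpha=0.05):
    """Apply the stated rules; thresholds default to the values in the text."""
    integrity = sum(s.counts) / s.n_offspring  # assumed definition of integrity
    return (
        s.pattern == "aa x bb"                    # parents homozygous and different
        and min(s.parent_depths) >= min_depth     # rule 1: parental depth
        and 2 <= s.n_tags <= 4                    # polymorphic, not repetitive
        and integrity >= min_integrity            # rule 2: completeness
        and distortion_p_value(s.counts) >= alpha # rule 3: no strong distortion
    )
```

For example, a locus with parental depths of 30× and 40×, two tags, and F2 counts of 38:77:38 would pass, whereas the same locus with counts of 70:60:20 would be rejected for segregation distortion.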

Linkage Map Construction and QTL Analysis
SLAF markers were assigned to seven linkage groups (LGs) based on their locations on the chromosomes of the reference genome. NGS data may contain genotyping errors and missing genotypes, which can reduce the quality of the genetic map; these errors were corrected with the HighMap strategy. All genotype data from the F2 mapping population were used for linkage analysis with JoinMap 4.0 software (Van Ooijen, 2006). The MLOD value between each pair of adjacent markers was computed (Vision et al., 2000), and SLAFs with MLOD values below 5 were filtered out. HighMap software was used to order the markers within each LG and to estimate the genetic distances between adjacent markers. QTL analysis was performed by interval mapping in R/qtl, and the confidence intervals of QTLs were calculated with the 95% Bayes credible interval method (Sen and Churchill, 2001) in R/qtl. The process of genetic map construction is shown in Figure 1.
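The idea behind a 95% Bayes credible interval can be illustrated with a short Python sketch. It is a simplified approximation written for this explanation, not the R/qtl implementation: it assumes a flat prior over map position, treats 10^LOD as proportional to the posterior density, and adds positions in order of decreasing LOD until 95% of the probability mass is covered. The function name and arguments are hypothetical.

```python
def bayes_credible_interval(positions, lods, prob=0.95):
    """Return (left, right) map positions of an approximate credible interval.

    positions: sorted map positions (cM) on one chromosome, at least two.
    lods: LOD score at each position.
    """
    n = len(positions)
    if n < 2 or n != len(lods):
        raise ValueError("need matching positions and LOD scores (at least two)")

    # Length of map represented by each position (half the gap on either side).
    widths = []
    for i in range(n):
        left = positions[i] - positions[i - 1] if i > 0 else positions[1] - positions[0]
        right = positions[i + 1] - positions[i] if i < n - 1 else positions[-1] - positions[-2]
        widths.append((left + right) / 2)

    # Under a flat prior the posterior is proportional to 10**LOD.
    # Subtracting the peak LOD avoids overflow without changing the result.
    peak = max(lods)
    mass = [10 ** (lod - peak) * w for lod, w in zip(lods, widths)]
    total = sum(mass)

    # Add positions in order of decreasing LOD until `prob` is covered.
    order = sorted(range(n), key=lambda i: lods[i], reverse=True)
    covered, chosen = 0.0, []
    for i in order:
        chosen.append(i)
        covered += mass[i] / total
        if covered >= prob:
            break
    return positions[min(chosen)], positions[max(chosen)]
```

A sharp LOD peak yields a narrow interval, while a flat LOD profile spreads the posterior mass and widens the interval, which is why QTLs with low LOD scores tend to have wide confidence intervals.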

RESULTS
Analysis of SLAF-seq Data and SLAF Markers
High-throughput sequencing generated a total of 22.98 Gb of raw data, comprising 143.63 M reads of ~80 bp in length after preprocessing. The Q30 ratio (the proportion of bases with a quality score of at least 30, which corresponds to a 0.1% chance of an error, or 99.9% confidence) was 81.35%, and the guanine-cytosine (GC) content was 39.04%. There were 12,417,012 reads from the female parent and 8,916,362 reads from the male parent, and the average number of reads per F2 individual was 799,335.
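The quality score mentioned above follows the standard Phred definition, Q = -10 log10(P), where P is the probability of a base-calling error. The sketch below, with hypothetical function names, shows the conversion and a simple consistency check on the reported data volume. Dividing 22.98 Gb by 143.63 M reads gives roughly 160 bases per read, which would be consistent with each counted read being a pair of ~80 bp ends; that interpretation is an assumption, not a statement from the text.

```python
import math


def phred_to_error_probability(q):
    """Convert a Phred quality score to the probability of a base-call error."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p):
    """Convert an error probability to a Phred quality score."""
    return -10 * math.log10(p)


def mean_bases_per_read(total_bases, n_reads):
    """Average number of sequenced bases per counted read."""
    return total_bases / n_reads


if __name__ == "__main__":
    print(phred_to_error_probability(30))        # error probability at Q30
    print(mean_bases_per_read(22.98e9, 143.63e6))  # bases per counted read
```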
The numbers of SLAFs in the male and female parents were 96,617 and 107,488, respectively. The average sequencing depths of the male and female parents were 29.05- and 39.95-fold, respectively. In the F2 population, the average number of SLAFs was 80,403, and the average coverage was 3.45-fold (Table 1).
Among the 115,789 high-quality SLAFs detected, 15,946 were polymorphic, a polymorphism rate of 13.77% (Table 2). Of the 15,946 polymorphic SLAFs, 15,196 were classified into eight segregation patterns (Figure 2). Because the F2 population was obtained by selfing the F1 of a cross between two parents with homozygous genotypes aa and bb, only markers with the aa × bb segregation pattern were used to construct the high-density genetic linkage map; a total of 14,712 markers fell into this class. Of these 14,712 markers, 3,077 were used for map construction: they were homozygous in the two parents, had a sequence depth of more than 10-fold, and had over 70% integrity of SLAF tags.
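The marker counts reported above form a filtering funnel, and the proportion retained at each stage is simple arithmetic. The following sketch (variable and function names are illustrative) recomputes those proportions from the counts given in the text; for instance, 15,946 of 115,789 corresponds to the reported polymorphism rate of 13.77%.

```python
# Marker counts as reported in the text, in order of successive filtering.
FUNNEL = [
    ("high-quality SLAFs", 115_789),
    ("polymorphic SLAFs", 15_946),
    ("assigned to a segregation pattern", 15_196),
    ("aa x bb pattern", 14_712),
    ("retained for map construction", 3_077),
]


def percent(part, whole):
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


def retention_by_stage(funnel):
    """Return (stage, count, percent of the previous stage) after the first stage."""
    rows = []
    for (_, previous), (name, count) in zip(funnel, funnel[1:]):
        rows.append((name, count, percent(count, previous)))
    return rows


if __name__ == "__main__":
    for name, count, pct in retention_by_stage(FUNNEL):
        print(f"{name}: {count} ({pct:.2f}% of previous stage)")
```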

Basic Characteristics of the Genetic Map
After linkage analysis, 3,057 of the 3,077 SLAF markers were placed on the genetic map, while the other 20 SLAFs were not linked to any group. On average, these markers had a sequence depth of 52.60-fold in S1002 (the male parent), 76.58-fold in S1000 (the female parent), and 4.97-fold in each F2 individual. The final genetic map included 3,057 markers on the seven LGs (Table 3 and Supplementary Figure S1) and was 1,061.19 cM in length, with an average inter-marker distance of 0.35 cM (Table 3). The largest LG was LG3, with 649 markers, a length of 212.14 cM, and an average distance of only 0.33 cM between adjacent markers. The smallest LG was LG7, with only 336 markers, a length of 100.69 cM, and an average distance of 0.30 cM. The largest gap on the map was 2.36 cM, located in LG5. The genetic map included 4,475 SNPs (Table 3, Supplementary Tables S1-S3).
To date, this may be the densest SNP genetic linkage map for cucumber.
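The average inter-marker distances quoted above appear to be the map length divided by the number of markers. A minimal sketch that reproduces the reported values from the figures in the text is given below; the dictionary layout and function name are illustrative, and dividing by the number of intervals (markers minus one) would give nearly identical results at this marker density.

```python
# (number of markers, length in cM) as reported in the text.
MAP_SUMMARY = {
    "LG3": (649, 212.14),
    "LG7": (336, 100.69),
    "whole map": (3057, 1061.19),
}


def mean_spacing(n_markers, length_cm):
    """Average distance between adjacent markers, as length / marker count."""
    return length_cm / n_markers


def markers_per_group(n_markers, n_groups=7):
    """Average number of markers per linkage group."""
    return n_markers / n_groups


if __name__ == "__main__":
    for name, (n, length) in MAP_SUMMARY.items():
        print(f"{name}: {mean_spacing(n, length):.2f} cM between adjacent markers")
    print(f"{markers_per_group(3057):.2f} markers per LG")
```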

QTL Analysis Using the High-Density Genetic Map
Phenotypic data of the two parents, F1, F2, and F2:3 families are presented in Figure 3. QTLs for the two fruit traits were detected in the F3 families (Table 4; Figures 4 and 5) and mapped to unique positions. A total of 15 QTLs were detected, seven for fruit length and eight for fruit diameter. The proportion of phenotypic variance explained by a single QTL (r2) ranged from 5.7 to 13.6%, and LOD scores ranged from 2.05 to 5.49. The seven QTLs for fruit length were located on chromosomes 3, 4, 6, and 7 and accounted for 7.6-13.6% of the phenotypic variation. The eight QTLs for fruit diameter were located on chromosomes 1, 3, 5, 6, and 7 and explained 5.7-13.3% of the phenotypic variation (Table 4; Figures 4 and 5).

Frontiers in Plant Science | www.frontiersin.org

FIGURE 3 | Phenotypic evaluation of fruit length and fruit diameter for S1000, S1002, and F3. The Y axis is the number of plant individuals. The X axis is divided into classes. For fruit length (fl), the classes are fl ≤ 15 cm, 15 cm < fl ≤ 18 cm, 18 cm < fl ≤ 21 cm, 21 cm < fl ≤ 24 cm, 24 cm < fl ≤ 27 cm, 27 cm < fl ≤ 30 cm, 30 cm < fl ≤ 33 cm, and fl > 33 cm. For fruit diameter (fd), the classes are fd ≤ 38 mm, 38 mm < fd ≤ 41 mm, 41 mm < fd ≤ 44 mm, 44 mm < fd ≤ 47 mm, 47 mm < fd ≤ 50 mm, 50 mm < fd ≤ 53 mm, 53 mm < fd ≤ 56 mm, 56 mm < fd ≤ 59 mm, and fd > 59 mm.
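The LOD scores and explained variances reported above are related. For a single-QTL model with normally distributed residuals, the proportion of phenotypic variance explained can be approximated from the LOD score and the number of phenotyped individuals n as r2 = 1 - 10^(-2·LOD/n). The sketch below applies this textbook relationship; it is not necessarily how the values in Table 4 were computed, and the choice of n is an assumption (about 150 families were phenotyped). With n = 150, LOD scores of 2.05 and 5.49 correspond to roughly 6% and 15-16%, the same order of magnitude as the reported range; exact values depend on missing data and on the model used.

```python
import math


def variance_explained(lod, n):
    """Approximate proportion of variance explained by a single QTL.

    Assumes a single-QTL normal model, where LOD = (n/2) * log10(RSS0/RSS1).
    """
    return 1.0 - 10.0 ** (-2.0 * lod / n)


def lod_for_variance(r2, n):
    """Inverse relationship: LOD expected for a QTL explaining r2 of the variance."""
    return -(n / 2.0) * math.log10(1.0 - r2)


if __name__ == "__main__":
    n_families = 150  # assumed number of phenotyped families
    for lod in (2.05, 5.49):
        print(f"LOD {lod}: r2 ~ {100 * variance_explained(lod, n_families):.1f}%")
```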

DISCUSSION
SLAF-seq is not only a new but also a highly automated technique that was developed on the basis of high-throughput sequencing. Because SLAF-seq sequences the paired ends of restriction fragments of a specific length, its repeatability is better than that of other techniques (such as RAD-seq, double digest RAD-seq, and GBS). The method offers significant advantages, such as the development of large numbers of highly accurate markers with less sequencing, which make it very suitable for the analysis of cucumber, a species with low polymorphism. Conventional methods of marker development were usually inefficient, expensive, and time-consuming (Kennedy et al., 2003; Xie et al., 2010), whereas SLAF-seq offers higher density, uniformity, and efficiency. SLAF-seq has been used in several studies since it was developed: Zhang et al. (2013) constructed the first high-density SNP genetic linkage map of sesame, Huang S. et al. (2013) obtained the draft genome of kiwifruit (Actinidia chinensis), Chen et al. (2013) developed 7E chromosome-specific molecular markers for Thinopyrum elongatum, and Wei et al. (2014) and Xu et al. (2015a,b) each constructed high-density SNP genetic maps for cucumber.
In this research, we developed thousands of SNP markers for cucumber. We obtained 22.98 Gb of raw data from high-throughput sequencing, consisting of 143.63 M reads and including 115,789 high-quality SLAF markers with a polymorphism rate of 13.77%. The markers covered all seven cucumber chromosomes, with 1,343 to 3,545 polymorphic SLAF markers on each chromosome, and a total of 15,946 polymorphic SLAF markers were identified as candidates for constructing the high-density SNP map. The integrity and accuracy of the markers were high, and their quality and quantity met the requirements for construction of a high-density genetic linkage map. Therefore, SLAF-seq is suitable for discovering plant chromosome-specific molecular markers with high success rates, specificity, and stability at low cost.
Compared with the other two SNP maps of cucumber (Wei et al., 2014; Xu et al., 2015a), the genetic map reported in this paper has the highest density. The map spans 1,061.19 cM, with an average of 436.71 markers per LG, and it has the largest number of markers (3,057) and the smallest average distance between adjacent markers (0.35 cM) of all cucumber genetic maps (Table 5). These markers could be used in different populations. More importantly, SNPs are the most abundant type of genetic variation among individuals, and they are very suitable for comparative genomics (Luo et al., 2009) and association mapping (Cogan et al., 2006; Chan et al., 2010).
This map not only provides large-scale SNP markers for cucumber but also provides useful data for cucumber QTL analysis, gene fine mapping, map-based gene cloning, and molecular breeding. Because all seven LGs of this SNP map were constructed from genome-wide molecular markers, the map could serve as a reference for positioning sequence scaffolds on the physical map of cucumber. Fruit is the most important commodity trait of cucumber (Marcelis, 1992), so high fruit yield and quality are always the most important objectives of cucumber breeding. Using F3 families and a BC1 population, Kennard and Havey (1995) obtained QTLs for several important cucumber fruit quality traits, including fruit length, fruit diameter, seed-cavity diameter, fruit skin color, the ratio of fruit length to diameter, and the ratio of seed-cavity to fruit diameter. Subsequently, Serquen et al. (1997), using F3 families, and Dijkhuizen and Staub (2002), using S3 and BC families, obtained QTLs for fruit number, fruit weight, fruit length, fruit diameter, and fruit length/diameter ratio, and Fazio et al. (2003) identified additional QTLs for the same cucumber fruit traits. In previous studies by our group, Yuan et al. (2008b) identified 38 QTLs for eight yield and quality components (fruit length, weight, pedicel length, flesh thickness, seed-cavity diameter, stalk length, the ratio of fruit length/diameter, and the ratio of fruit length/stalk) using F2 and F3 populations. Weng et al. (2015) identified three QTLs for fruit length, located on chromosomes 3, 4, and 6, and three QTLs for fruit diameter, located on chromosomes 2, 5, and 6. Zhou et al. (2015) detected two QTLs for fruit length, located on chromosomes 5 and 7.
We identified 15 QTLs for fruit length and fruit diameter in this study. Of the seven QTLs for fruit length, two were detected repeatedly: fl7.1 in 2013 autumn and 2014 spring, and fl4.1 in 2014 spring and 2015 spring. We searched for genes in the repeated regions on the ICuGI website (http://www.icugi.org/cgi-bin/ICuGI/index.cgi) and found several genes related to auxin response and multicellular development. Eight QTLs were detected for fruit diameter, including one repeated QTL, fd6.1, detected in 2013 autumn and 2014 spring. This region contains several genes related to plant hormone response, multicellular processes, and cell mitosis. We therefore suspect that genes associated with fruit length and diameter may be located in these regions.
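Identifying QTLs that recur across seasons amounts to grouping detections by QTL name and counting the distinct seasons. The sketch below does this for the repeated detections named above; the record layout and function name are illustrative, and only the detections explicitly listed in the text are included.

```python
from collections import defaultdict

# (QTL name, season) pairs for the repeated detections named in the text.
DETECTIONS = [
    ("fl7.1", "2013 autumn"),
    ("fl7.1", "2014 spring"),
    ("fl4.1", "2014 spring"),
    ("fl4.1", "2015 spring"),
    ("fd6.1", "2013 autumn"),
    ("fd6.1", "2014 spring"),
]


def repeated_qtls(records, min_seasons=2):
    """Return {QTL name: sorted seasons} for QTLs seen in at least min_seasons."""
    seasons = defaultdict(set)
    for name, season in records:
        seasons[name].add(season)
    return {
        name: sorted(found)
        for name, found in seasons.items()
        if len(found) >= min_seasons
    }


if __name__ == "__main__":
    for name, found in repeated_qtls(DETECTIONS).items():
        print(name, "detected in", ", ".join(found))
```

Setting `min_seasons=3` returns an empty result for these records, which mirrors the observation that no QTL was shared by all three seasons.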
In our study, the LOD values of most QTLs identified for fruit length and diameter fell between 2.0 and 3.5, which is low for ascertaining the accuracy of the QTLs. In our opinion, there are several reasons for this. First, the phenotypic data might contain some error, which is one of the important factors affecting LOD values. Second, the low LOD values might result from the analysis software. We used R/qtl for QTL analysis, and its reference coefficients differ from those of MapQTL, which might lead to lower LOD values in the results.
In this study, no QTL for fruit length or fruit diameter was common to all three seasons (2013 autumn, 2014 spring, and 2015 spring). We speculate that one of the most important reasons is that fruit length and diameter are easily influenced by the environment. The phenotypic statistics were relatively similar in 2014 spring and 2015 spring but differed from those of 2013 autumn. During cucumber growth, the temperature and humidity in 2013 autumn were much higher than in 2014 spring and 2015 spring, and under these hot and humid conditions, diseases and pests such as powdery mildew and root rot occurred frequently. The QTL analysis showed a similar pattern: common QTLs (for both fruit length and fruit diameter) were found in 2014 spring and 2015 spring. Deviation in the collection of phenotypic data might be another reason why no QTL was common to the three seasons. In addition, few QTLs were detected in each season, which probably also contributed to the absence of a QTL common to all three seasons.
Some of the 15 QTLs for fruit length and fruit diameter identified in our study are consistent with previous reports; for example, fl4.1 coincides with the main-effect QTL reported by Yuan et al. (2008a), and fd5.1 coincides with a QTL in the same report. Others differ from the QTLs previously reported for the two fruit traits. We suspect two reasons for this result. First, the genes controlling the same trait may differ among materials. Second, under the influence of environment, climate, and other conditions, fruit characters differ between seasons, which leads to different mapping results.

CONCLUSION
We obtained a high-density SNP genetic linkage map for cucumber. The map was constructed from an F2 population with polymorphic markers developed by SLAF-seq. We generated a total of 22.98 Gb of raw data, comprising 143.63 M reads of ~80 bp in length after preprocessing. The final genetic map included 3,057 markers on the seven LGs and was 1,061.19 cM in length, with an average inter-marker distance of 0.35 cM. Based on this high-density genetic map, QTL analysis of fruit length and fruit diameter detected seven QTLs for fruit length and eight QTLs for fruit diameter. The results of this study will not only provide a platform for gene/QTL fine mapping, map-based gene isolation, and molecular breeding of cucumber, but also provide a reference for positioning sequence scaffolds on the physical map and assist in the assembly of the cucumber genome sequence.

AUTHOR CONTRIBUTIONS
W-YZ constructed the mapping populations, surveyed fruit traits, performed genetic analysis and marker development, and wrote the paper. LH performed genome sequencing, constructed the map, and performed the mapping analysis. J-TY constructed the mapping populations. LC, M-LQ, and D-QY performed some of the field work and assisted with DNA extraction. H-LL and H-LH provided valuable research ideas. J-SP and RC designed and supervised the study.

ACKNOWLEDGMENTS
We thank Sanwen Huang for providing the cucumber parental materials and cucumber genome data. This research was partially supported by the National Program on Key Basic Research Project of China (2012CB113900); the Shanghai Municipal Committee of Science and Technology (Grant No. 13JC1403600); the China Innovative Research Team, Ministry of Education; the Shanghai Graduate Education and Innovation Program (Horticulture); the Shanghai Municipal Agricultural Commission Project (Hu Nong Ke ZhongZi-2015-6); and the Agri-X Project of Shanghai Jiao Tong University (Agri-X2015002).