High-Density Genetic Map Construction and Gene Mapping of Basal Branching Habit and Flowers per Leaf Axil in Sesame

A good genetic map provides the framework for quantitative trait loci (QTL) analysis, map-based gene cloning, and genome sequence assembly. The main objective of this study was to develop a high-density genetic linkage map of sesame using specific length amplified fragment sequencing (SLAF-seq). As a result, a high-resolution genetic map with 9,378 SLAF markers and 13 linkage groups (LGs) was constructed. The map spanned a total genetic distance of 1,974.23 cM, and the mean LG length was 151.86 cM, with an average genetic distance of 0.22 cM between adjacent markers. Based on the newly constructed genetic map, genes for basal branching habit (SiBH) and flowers per leaf axil (SiFA) were mapped to LG5 and LG11, respectively.


INTRODUCTION
Sesame (Sesamum indicum L., 2n = 26) belongs to the Pedaliaceae family and is one of the most ancient domesticated oil crops (Bedigian and Harlan, 1986). Sesame seed contains about 55% oil, 20% protein, and 13.5% carbohydrates, and is traditionally regarded as "the queen of oilseeds" for its high content of unsaturated fatty acids (43% oleic acid, 35% linoleic acid) and its rich natural antioxidants such as sesamin and sesamolin (Ashri, 1998; Namiki, 2007). However, its production and expansion have been limited by low yields, seed shattering, and high manual labor requirements. At present, sesame is mainly grown in tropical and subtropical regions of Asia, Africa, and Latin America. From 2010 to 2014, the annual area of sesame harvested worldwide was about 8.10 × 10^6 hectares with a total production of 4.67 × 10^6 metric tons (FAO data); most of this crop was grown in developing and underdeveloped countries, and almost all of it was harvested manually. There is an urgent need to develop new cultivars with high yield potential, resistance/tolerance to biotic and abiotic stresses, and enhanced agronomic traits adapted to mechanized harvest (Uzun et al., 2003; Ashri, 2007; El-Bramawy et al., 2008; Uzun and Cagirgan, 2009).
Branching habit is one of the most important components of plant architecture and plays an important role in the grain yield and cultivation practices of many crops, including sesame (Baydar, 2005; Teichmann and Muhr, 2015). Branching patterns vary markedly among sesame varieties and germplasm accessions. The wild species have many branches, while landraces and modern cultivars have few or no branches. The number of flowers per leaf axil is another important contributor to sesame yield, and plants with more flowers have the potential to set more capsules and consequently more seeds. In sesame, one or more (usually three) flowers can be borne on short pedicels in a leaf axil. When only one flower is produced, the two lateral primordia differentiate into nectaries. Previous studies indicated that the inheritance of branching habit (branched vs. unbranched) (Nohara, 1933; Brar and Ahuja, 1979; Baydar and Turgut, 2000) and of flowers per leaf axil (mono-flower vs. tri-flower) (Nohara, 1933; Sikka and Gupta, 1947) was each determined by a single dominant gene. However, the genetic basis of both traits has remained elusive.
A genetic map is the prerequisite for mapping and cloning quantitative trait loci (QTL) or genes for agronomically important traits in crops (Mackay et al., 2009; Würschum, 2012). Because sesame is an orphan or neglected crop, high-density genetic maps for it are still scarce. The first genetic map of sesame was developed using an F2 population of 96 individuals (Wei et al., 2009). It contained 220 markers, including 8 expressed sequence tag-simple sequence repeats (EST-SSRs), 25 amplified fragment length polymorphisms (AFLPs), and 187 Random Selective Amplification of Microsatellite Polymorphic Loci (RSAMPLs), which were distributed on 30 linkage groups (LGs) and spanned 936.72 cM. In 2012, an improved genetic map was constructed using an F2 population of 260 plants derived from the same cross and was used for mapping QTLs for seed coat color. That linkage map covered 1,216.00 cM and contained 653 marker loci (30 EST-SSR, 50 AFLP, and 573 RSAMPL) distributed on 14 LGs, and 4 QTLs for seed coat color were mapped to LG1, LG11, and LG13. The advent of next-generation sequencing (NGS) technologies has significantly reduced sequencing load and cost (Davey et al., 2011), especially through methods that combine NGS with restriction enzyme digestion to reduce the complexity of the target genome, such as reduced-representation sequencing (RRS) using reduced-representation libraries (RRLs) (Van Tassell et al., 2008), restriction-site-associated DNA sequencing (RAD-seq) (Miller et al., 2007), genotyping-by-sequencing (GBS) (Elshire et al., 2011), and specific length amplified fragment sequencing (SLAF-seq) (Sun et al., 2013). The SLAF-seq method has several advantages, such as low cost, high efficiency, and accuracy in marker development and genotyping (Sun et al., 2013).
It has been widely used for high-density genetic map construction, QTL mapping, and genome-wide association studies (GWAS) in soybean (Li et al., 2014; Qi et al., 2014; Zhao et al., 2015), rice (Mao et al., 2015), cotton, cucumber (Wei et al., 2014; Xu et al., 2015a,b; Zhu et al., 2016), pumpkin, wax gourd (Jiang et al., 2015), tobacco (Gong et al., 2016), grape, and sesame (Zhang Y. et al., 2013).
The main objectives of this study were to develop a high-density genetic linkage map using the SLAF-seq method and to map genes controlling basal branching habit and flowers per leaf axil in sesame.

Plant Materials and Trait Investigation
A BC1 population of 300 individuals was developed from a cross between two sesame cultivars, Yuzhi 4 and Bengal Small-seed (hereinafter BS), with the former as the maternal and recurrent parent and the latter as the paternal parent. Yuzhi 4 is an elite cultivar developed by the Zhumadian Institute of Agricultural Sciences (Henan, China) in the 1980s, and it has been widely grown in most of the sesame-growing regions of China over the last three decades. Yuzhi 4 is also one of the most important founder parents in Chinese sesame breeding programs: more than one third of the cultivars released in the Yellow River and Huaihe River regions have been derived from it (Liu et al., 2012). BS is an unnamed cultivar introduced from Bangladesh, with multiple basal branches, one flower per leaf axil, small seed size, high lignan content, and later flowering than Yuzhi 4. Both cultivars had been artificially self-pollinated for at least six generations to ensure pure inbred lines for the development of segregating populations. The BC1 population, the two parents, and their F1 were grown at the Sanya experimental station (18°14′ N, 109°29′ E) of HSRC-HAAS (Hainan, China) in the winter of 2014. The number of branches and the number of flowers per leaf axil were recorded for each plant, and a subset of 150 BC1 individuals was randomly selected for genotyping and mapping analysis.

DNA Extraction and SLAF-Sequencing
Genomic DNA of the two parents and the 150 BC1 individuals was extracted from young leaves using the CTAB method described by Paterson et al. (1999), with some modification to the components of the CTAB buffer (11.69 g NaCl, 2 g CTAB, 2 g PVP40, 5.92 g D-sorbitol, and 1.25 g Na2SO3 in a total volume of 100 ml of 20 mM EDTA, 100 mM Tris-HCl, pH 8.0) to eliminate the abundant polysaccharides in sesame leaves. Crude DNA samples were purified using a DNeasy Kit (Qiagen, Valencia, CA, USA), assessed by electrophoresis on a 0.8% agarose gel, and quantified by spectrophotometry (NanoDrop 8000, Thermo Scientific, USA). The SLAF libraries were constructed following the procedure described by Sun et al. (2013), except that two restriction enzymes HaeIII (recognition site 5′-GG/CC-3′, New England Biolabs, NEB, USA) and Hpy166II (5′-GTN/NAC-

SLAF-Seq Data Grouping and Genotyping
The SLAF-seq data grouping and genotyping were performed as described in detail by Sun et al. (2013). Briefly, raw reads were demultiplexed to the 152 individuals (150 progenies and two parents) according to their barcode sequences, and reads with a quality score lower than 20 were filtered out. High-quality reads were clustered based on sequence similarity by BLAT (-tileSize=10 -stepSize=5) using one-to-one alignment (Kent, 2002). Identical reads were merged to reduce computational load, and sequences with over 90% similarity were grouped into one SLAF locus. Alleles of each SLAF were defined using minor allele frequency (MAF) evaluation. Because sesame is a diploid species, a cross between two parents can carry at most four alleles at one locus; SLAF groups containing more than four alleles were therefore considered repetitive and filtered out. SLAFs with 2-4 alleles were identified as polymorphic and sorted into eight segregation patterns: ab×cd, ef×eg, hk×hk, lm×ll, nn×np, aa×bb, ab×cc, and cc×ab. The BC1 population used here was derived from two homozygous inbred lines; therefore, only the SLAF markers with the aa×bb segregation pattern were used for subsequent analysis.
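The allele-count rule and the eight segregation patterns described above can be illustrated with a minimal sketch. It is not the pipeline of Sun et al. (2013): the function names, the representation of a parental genotype as a pair of allele strings, and the ASCII "x" used in place of "×" in the labels are assumptions made only for illustration.

```python
def locus_class(n_alleles):
    """Classify a SLAF locus by the number of alleles observed (illustrative rule)."""
    if n_alleles > 4:
        return "repetitive"      # more than four alleles: filtered out
    if n_alleles >= 2:
        return "polymorphic"     # 2-4 alleles
    return "monomorphic"


def segregation_pattern(parent1, parent2):
    """Return the segregation pattern for two diploid parental genotypes.

    Each parent is a pair of allele strings, e.g. ("A", "G").
    Returns None when both parents are homozygous for the same allele.
    """
    a, b = set(parent1), set(parent2)
    het1, het2 = len(a) == 2, len(b) == 2
    shared = len(a & b)
    if het1 and het2:
        return {0: "ab x cd", 1: "ef x eg", 2: "hk x hk"}[shared]
    if het1:                      # parent 1 heterozygous, parent 2 homozygous
        return "lm x ll" if shared else "ab x cc"
    if het2:                      # parent 1 homozygous, parent 2 heterozygous
        return "nn x np" if shared else "cc x ab"
    return "aa x bb" if shared == 0 else None


# Two homozygous inbred parents that differ at the locus give aa x bb,
# the only pattern retained for the BC1 population.
print(segregation_pattern(("A", "A"), ("G", "G")))
```

Under this scheme, a marker is kept for the BC1 analysis only when `segregation_pattern` returns the aa×bb label.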

Genetic Map Construction
The high-density linkage map was constructed using the HighMap method. Briefly, modified logarithm of odds (mLOD) scores were used to allocate the SLAF markers to LGs, and markers with mLOD scores <5 were filtered out. SLAF markers in each LG were ordered using the maximum likelihood (ML) algorithm, and genotyping errors were corrected with the SMOOTH algorithm. The Kosambi mapping function was employed to convert recombination fractions to genetic distances in cM.
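For reference, the Kosambi mapping function converts a recombination fraction r into a map distance d = 25 ln[(1 + 2r)/(1 − 2r)] cM. The short sketch below implements this standard formula and its inverse; the function names are illustrative and are not taken from HighMap.

```python
import math


def kosambi_cm(r):
    """Kosambi map distance in cM for a recombination fraction r (0 <= r < 0.5)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must satisfy 0 <= r < 0.5")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


def kosambi_inverse(d_cm):
    """Recombination fraction implied by a Kosambi distance given in cM."""
    return 0.5 * math.tanh(2.0 * d_cm / 100.0)


for r in (0.01, 0.10, 0.25):
    print(f"r = {r:.2f} -> {kosambi_cm(r):.2f} cM")
```

For small r the distance in cM is close to 100r, and the two diverge as r approaches 0.5 because the function accounts for undetected double crossovers under interference.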

QTL/Gene Mapping
According to previous studies and our own observations, branching habit (branched vs. unbranched) (Nohara, 1933; Brar and Ahuja, 1979; Baydar and Turgut, 2000) and flowers per leaf axil (mono-flower vs. tri-flower) (Nohara, 1933; Sikka and Gupta, 1947) in sesame are each controlled by a single dominant gene. For flowers per leaf axil, the progenies in the segregating population fell into two clear classes (mono-flower or tri-flower) and fitted a 1:1 segregation ratio (χ² = 2.67), indicating that the trait was controlled by a single gene; the trait was therefore treated as a dominant marker, and gene mapping was performed by linkage test with the SLAF markers. For branching habit, however, the branch number of the BC1 individuals showed a continuous distribution (0 to 8). Therefore, QTL mapping strategies were employed to scan for the target regions controlling branching habit. QTL mapping was performed using the IciMapping 4.1 software (Meng et al., 2015) with the inclusive composite interval mapping additive (ICIM-ADD) model (Li et al., 2007), and the LOD threshold was determined by 1,000 permutations (P = 0.05).
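The idea behind a permutation-based LOD threshold can be sketched as follows. This is not the ICIM-ADD model implemented in IciMapping: a simple single-marker scan is substituted here for illustration, with LOD = (n/2) log10(RSS_null/RSS_marker) for a two-class marker, and all function names and the synthetic data are assumptions.

```python
import math
import random


def marker_lod(genotypes, phenotypes):
    """Single-marker LOD score: (n/2) * log10(RSS_null / RSS_marker)."""
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    rss_null = sum((y - grand_mean) ** 2 for y in phenotypes)
    rss_marker = 0.0
    for cls in set(genotypes):
        ys = [y for g, y in zip(genotypes, phenotypes) if g == cls]
        class_mean = sum(ys) / len(ys)
        rss_marker += sum((y - class_mean) ** 2 for y in ys)
    if rss_null <= 0.0:
        return 0.0                # no phenotypic variation at all
    if rss_marker <= 0.0:
        return float("inf")       # marker explains all variation
    return (n / 2) * math.log10(rss_null / rss_marker)


def permutation_threshold(markers, phenotypes, n_perm=1000, alpha=0.05, seed=0):
    """Genome-wide LOD threshold from the permutation distribution of the maximum LOD."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)     # break the genotype-phenotype association
        maxima.append(max(marker_lod(g, shuffled) for g in markers))
    maxima.sort()
    index = min(n_perm - 1, math.ceil((1 - alpha) * n_perm) - 1)
    return maxima[index]


# Synthetic example: marker 0 affects the trait, markers 1 and 2 do not.
rng = random.Random(1)
m0 = [i % 2 for i in range(40)]
m1 = [(i // 2) % 2 for i in range(40)]
m2 = [(i // 4) % 2 for i in range(40)]
trait = [3.0 * g + rng.gauss(0, 0.5) for g in m0]
threshold = permutation_threshold([m0, m1, m2], trait, n_perm=200)
print(marker_lod(m0, trait) > threshold)
```

Shuffling phenotypes while keeping genotypes fixed preserves the marker correlation structure, so the resulting threshold controls the genome-wide rather than the per-marker error rate.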

Genotyping the BC1 Population Using SLAF-Seq
A total of 71.15 Gb of high-quality sequence data, comprising 355.78 M 100-bp paired-end reads, were generated by high-throughput sequencing of the SLAF library. The average guanine-cytosine (GC) content and Q30 ratio (the proportion of bases with a quality score of at least 30) of the data were 39.56% and 87.14%, respectively. According to the index sequences, 12.75, 11.94, and 331.08 M reads were assigned to Yuzhi 4 (maternal and recurrent parent), BS (paternal parent), and the 150 BC1 progenies, respectively (Table 1). The number of reads for the 150 BC1 individuals ranged from 892,224 to 2,887,083, with an average of 2,207,207 (Supplementary Table S1). High-quality paired-end reads with clear index information were clustered based on sequence similarity (Kent, 2002), and a total of 230,620 SLAF tags were obtained after filtering out low-depth tags. In the parental lines, 208,378 and 196,862 SLAFs were identified for Yuzhi 4 and BS, with average depths of 41.91- and 37.70-fold, respectively (Table 1). In the 150 BC1 individuals, the number of SLAFs ranged from 123,705 to 166,623 with an average of 149,794, and the mean depth ranged from 4.19- to 15.69-fold with an average of 10.23-fold (Table 1 and Supplementary Table S2). Among the 230,620 SLAFs obtained, 42,115 were polymorphic, giving a polymorphism rate of 18.26%. The genotypes of the two parental lines were designated with different letters following a genotype encoding rule to determine segregation patterns, and 36,592 of the 42,115 polymorphic SLAFs were successfully encoded and grouped into eight segregation patterns (ab×cd, ef×eg, hk×hk, lm×ll, nn×np, aa×bb, ab×cc, and cc×ab) (Figure 1). Because the two parents (Yuzhi 4 and BS) are homozygous inbred lines, only the 29,980 SLAFs falling into the aa×bb segregation pattern could be used for linkage analysis.
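The proportions quoted above follow directly from the reported counts. The short calculation below is only a consistency check on those figures; the variable names are illustrative.

```python
total_slafs = 230_620      # SLAF tags retained after depth filtering
polymorphic = 42_115       # polymorphic SLAFs
encoded = 36_592           # polymorphic SLAFs assigned a segregation pattern
aa_bb = 29_980             # SLAFs in the aa x bb pattern

polymorphism_rate = 100 * polymorphic / total_slafs
aa_bb_share = 100 * aa_bb / encoded

print(f"polymorphism rate: {polymorphism_rate:.2f}%")
print(f"aa x bb share of encoded SLAFs: {aa_bb_share:.2f}%")
```

The first value reproduces the 18.26% polymorphism rate; the second shows that roughly four fifths of the encoded polymorphic SLAFs fell into the aa×bb pattern, as expected for a cross between two inbred lines.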

High-Density Genetic Map Constructed with SLAF Markers
To ensure the accuracy of genotyping, the following filters were applied: (1) SLAFs with a depth of less than 10-fold in each parent were discarded, (2) SLAFs with more than three SNPs were removed, (3) SLAFs with a depth of less than twofold in any of the 150 progenies were scored as missing data for that individual, and (4) SLAFs with more than 30% missing data or with excessively distorted segregation ratios (χ² test, P ≤ 0.01) were excluded. A final set of 9,823 high-quality SLAF markers was then used for genetic map construction with the HighMap method. As a result, a high-density genetic map with 13 LGs and 9,378 SLAFs was constructed (Supplementary Table S3 and Data S1
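The four filtering steps can be sketched as a single predicate. The `Slaf` record, its field names, and the function names are hypothetical; the sketch assumes that a marker is discarded when either parent falls below the depth threshold, that low-depth progeny calls have already been set to missing (step 3), and that BC1 progeny are expected to segregate 1:1 for the two genotype classes.

```python
import math
from dataclasses import dataclass


@dataclass
class Slaf:
    marker_id: str
    parent_depths: tuple   # sequencing depth in (parent 1, parent 2)
    n_snps: int            # number of SNPs within the SLAF tag
    calls: list            # per-progeny genotype: "aa", "ab", or None if missing


def chi2_p_1to1(n_aa, n_ab):
    """P-value of a chi-square goodness-of-fit test (df = 1) against a 1:1 ratio."""
    total = n_aa + n_ab
    if total == 0:
        return 0.0
    expected = total / 2
    chi2 = ((n_aa - expected) ** 2 + (n_ab - expected) ** 2) / expected
    # For one degree of freedom the survival function equals erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def passes_filters(slaf, min_parent_depth=10, max_snps=3,
                   max_missing=0.30, p_cutoff=0.01):
    if min(slaf.parent_depths) < min_parent_depth:       # step 1
        return False
    if slaf.n_snps > max_snps:                           # step 2
        return False
    missing = sum(c is None for c in slaf.calls) / len(slaf.calls)
    if missing > max_missing:                            # step 4a
        return False
    p_value = chi2_p_1to1(slaf.calls.count("aa"), slaf.calls.count("ab"))
    return p_value > p_cutoff                            # step 4b


example = Slaf("MarkerX", (40, 35), 1, ["aa"] * 75 + ["ab"] * 75)
print(passes_filters(example))
```

The thresholds are exposed as keyword arguments so that the effect of a stricter or looser cutoff on the number of retained markers can be explored.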

QTL/Gene Mapping of Basal Branching Habit and Flowers Per Leaf Axil
Yuzhi 4 is a uniculm (unbranched) cultivar with three flowers per leaf axil, while BS is a branched line with one flower per axil. The F1 plants all showed the branched, mono-flower phenotype, which confirmed that mono-flower is dominant to tri-flower and branched is dominant to unbranched (Figure 3). The number of branches of the 150 BC1 plants showed a continuous distribution ranging from 0 to 8 (Figure 4). QTL analysis was first performed using QTL IciMapping 4.1 with the ICIM-ADD model (Li et al., 2007). A major QTL (designated qBH-LG5) with a LOD score of 50.27 was detected on LG5 at 2.50 cM, flanked by Marker41538 and Marker31462, with a confidence interval of 1.75 to 2.75 cM; it explained 78.64% of the phenotypic variation (PVE) (Figures 5, 6A).
FIGURE 1 | Number of SLAF markers in each segregation pattern. The x-axis indicates the eight segregation patterns grouped by the genotype encoding rule and the y-axis represents the number of SLAF markers.
Frontiers in Plant Science | www.frontiersin.org
FIGURE 2 | Distribution of SLAF markers on the 13 LGs of sesame. Each black bar indicates a marker. The x-axis indicates the LG and the y-axis represents genetic distance (cM).
If plants with ≤2 branches were classified as "unbranched type" and those with ≥3 branches as "branched type," the BC1 population contained 76 branched and 74 unbranched plants, which fitted a 1:1 segregation ratio (χ² = 0.03), indicating that basal branching habit was controlled by a single gene (designated SiBH). The SiBH gene was mapped to LG5 at 2.385 cM by linkage analysis, almost the same position as qBH-LG5. Three markers (Marker129539, Marker41538, and Marker31462) were found to be tightly linked. Among them, Marker41538 co-segregated with SiBH, and the genetic distances of the flanking markers Marker129539 and Marker31462 from the gene were 0.68 and 0.22 cM, respectively (Figure 6A). For flowers per leaf axil, 85 of the 150 BC1 individuals showed the mono-flower type and 65 showed the tri-flower type (χ² = 2.67 for a 1:1 ratio), which confirmed that flowers per axil was controlled by a single gene (designated SiFA). In the high-density map, the SiFA gene was mapped to LG11 at 128.56 cM, and three markers (Marker58311, Marker34507, and Marker36337) were found to be tightly linked. Among them, Marker34507 co-segregated with SiFA, and the genetic distances of the flanking markers Marker58311 and Marker36337 from the gene were 0.26 and 0.64 cM, respectively (Figure 6B).
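The two segregation tests reported above can be reproduced with a short goodness-of-fit calculation. The sketch uses the branch-count cutoff stated in the text (≤2 branches as unbranched); the function names are illustrative, and the P-value uses the closed form for one degree of freedom.

```python
import math


def classify_branching(n_branches, max_unbranched=2):
    """Convert a branch count into the two classes used for the linkage test."""
    return "unbranched" if n_branches <= max_unbranched else "branched"


def chi2_1to1(count_a, count_b):
    """Chi-square statistic and P-value (df = 1) for a 1:1 expected ratio."""
    expected = (count_a + count_b) / 2
    chi2 = ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


for trait, counts in {"branching habit": (76, 74),
                      "flowers per leaf axil": (85, 65)}.items():
    chi2, p = chi2_1to1(*counts)
    print(f"{trait}: chi2 = {chi2:.2f}, P = {p:.3f}")
```

Both statistics (0.03 and 2.67) lie below 3.84, the critical value for one degree of freedom at P = 0.05, so neither trait departs significantly from the 1:1 ratio expected for a single gene in a BC1 population.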

High-Density Genetic Map Constructed in Sesame
The first high-density genetic map of sesame was developed using the SLAF-seq method with an F2 population of 107 individuals (Zhang Y. et al., 2013). The map contained 1,233 SLAF markers on 15 LGs and covered 1,474.87 cM in total length, with an average distance of 1.20 cM between adjacent markers. Wu et al. (2014) constructed a genetic map using the RAD-seq method with a population of 224 recombinant inbred lines (RILs). The map consisted of 1,230 markers distributed on 14 LGs and spanned 844.46 cM, with an average of 0.69 cM between adjacent markers. Using this map and RIL population, they detected 13 QTLs on 7 LGs and 17 QTLs on 10 LGs for seven yield-related traits by multiple interval mapping (MIM) and mixed linear composite interval mapping (MCIM), respectively. In this study, we constructed a high-density genetic map using the SLAF-seq method with a BC1 population of 150 individuals. This map comprised 9,378 SLAF markers on 13 LGs and covered a total genetic distance of 1,974.23 cM, with an average distance of 0.22 cM between adjacent markers (Figure 2 and Supplementary Table S3). Compared with previously developed high-density genetic maps of sesame, the map constructed in this study had the highest marker density, fewer and smaller gaps, and a number of LGs equal to the haploid chromosome number of Sesamum indicum L. (2n = 26) (Table 3). Wang et al. (2016) sequenced 430 RILs using the RAD-seq method and developed a bin map by joining consecutive intervals on the genome that lacked a recombination event within the population. The map included 1,522 bins on 13 LGs (SLGs), with a total length of 1,090.99 cM and a mean distance of 0.72 cM between adjacent bins. The map was used to improve the Zhongzhi 13 genome assembly from version 1.0 to version 2.0, and it was the first map developed with a large population in sesame.
In theory, a large population allows recombination events between linked markers to be estimated precisely and ultimately determines map resolution and accuracy (Ferreira et al., 2006). In spite of this, five scaffolds longer than 150 kb remained unanchored in the improved Zhongzhi 13 genome assembly version 2.0. This may have been caused by low genetic dissimilarity between the two parents or by highly repetitive sequence in these scaffolds. Because the bin map strategy was adopted in their study, the properties of that map are not easy to compare with those of the map constructed in our study. In our study, among the 230,620 SLAF markers obtained, 42,115 were polymorphic, a polymorphism rate of 18.26%, which was higher than the rates reported by Zhang Y. et al. (2013) and Wu et al. (2014). Even when redundant markers (markers at the same locus) are merged, there are still 4,006 loci in our genetic map (Supplementary Table S3). Although redundant markers do not provide additional mapping information, we have retained them in the current high-density genetic map in the hope that it may serve as another framework for QTL mapping and genome assembly improvement.
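Merging redundant markers into loci can be illustrated with a small grouping sketch. It assumes that markers are redundant when they share the same LG and the same map position; the marker names and positions in the toy data are hypothetical and do not come from Supplementary Table S3.

```python
from collections import defaultdict


def collapse_to_loci(markers):
    """Group (marker_id, lg, position_cm) records that map to the same locus."""
    loci = defaultdict(list)
    for marker_id, lg, position_cm in markers:
        # Positions are rounded to guard against floating-point noise;
        # three decimals is an arbitrary choice for this sketch.
        loci[(lg, round(position_cm, 3))].append(marker_id)
    return dict(loci)


toy_markers = [
    ("MarkerA", 5, 2.385),
    ("MarkerB", 5, 2.385),    # same LG and position as MarkerA: redundant
    ("MarkerC", 5, 2.610),
    ("MarkerD", 11, 2.385),   # same position but a different LG: distinct locus
]
loci = collapse_to_loci(toy_markers)
print(len(toy_markers), "markers ->", len(loci), "loci")
```

Applied to the full map, a grouping of this kind is what reduces the 9,378 markers to the 4,006 distinct loci mentioned above, while the marker lists per locus remain available for scaffold anchoring.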

Inheritance of Branching Habit and Flowers Per Leaf Axil in Sesame
Branching habit can influence seed yield and cultivation practices, and it will play an important role in the development of cultivars suitable for mechanized harvest in sesame production. Nohara (1933) reported that the branching habit was dominant and suggested a single gene difference, but also commented on the difficulty of classification due to environmentally induced fluctuation. Brar and Ahuja (1979) reported that branching habit was monogenically controlled and that the monostem (unbranched) characteristic was controlled by a recessive gene. Baydar and Turgut (2000) observed a 3:1 segregation ratio for branching and non-branching types. However, branching habit is often influenced by environmental factors. For example, high-density planting can suppress the branch number of branched types, and even unbranched types can produce more than one branch in low-density populations.
Other environmental factors influencing branching habit include photoperiod, light level and quality, plant nutritional status, and nutrient availability. These factors usually lead to a continuous distribution of branch number between the branched and unbranched types: plants at the extremes can be easily identified, but those in the middle are difficult to classify. Wang et al. (2009) performed both a linkage test and QTL mapping to map a Fusarium wilt resistance gene in upland cotton, and a major QTL and the FW^R gene were both detected in the same genomic region. The development of Fusarium wilt symptoms in cotton is affected by several environmental factors, which is similar to branch development in sesame. Therefore, both linkage analysis and QTL mapping strategies were employed to map the target genes in this study. As a result, a single major QTL, qBH-LG5, with a LOD score of 50.27 and a PVE of 78.64%, was identified on LG5 at 2.50 cM by QTL mapping, and the SiBH gene was mapped to LG5 at 2.385 cM by linkage analysis, almost the same position as qBH-LG5 (Figure 6A). It is noteworthy that, although single-gene inheritance of branching habit was found in the current and previous studies, the branching patterns in sesame are more complex. The International Plant Genetic Resources Institute (IPGRI) classified branching habit into non-branching, basal-branching, top-branching, and other types for sesame germplasm characterization (IBPGR, 2004), and the International Union for the Protection of New Varieties of Plants (UPOV) classified branching patterns into absent or very few, medium, and very many types according to the number of branches, and into basal branching, branching along the stem, and apical branching types according to the position of branches, in the distinctness, uniformity, and stability (DUS) test for the protection of intellectual property in new varieties².
The branching patterns mentioned above were all found in the characterization of a 5,200-accession germplasm collection conserved in our library. Whether the basal, top, and along-stem branching types are all controlled by a single gene, or whether the genes involved are allelic, and what the genetic basis of the medium and very many branch types is, all require further detailed research.
Compared with branching habit, the inheritance of the number of flowers per leaf axil is much simpler. Previous studies indicated that flowers per leaf axil was controlled by a single gene, with mono-flower dominant to tri-flower (Nohara, 1933; Sikka and Gupta, 1947). In this study, the inheritance and genetic effect of the SiFA gene were validated, and the gene was mapped to LG11 at 128.56 cM. The tightly linked markers for basal branching habit (Marker129539, Marker41538, and Marker31462) and flowers per leaf axil (Marker58311, Marker34507, and Marker36337) could provide useful information for further marker-assisted selection (MAS) breeding and map-based gene cloning studies.
² http://www.upov.int/resource/en/dus_guidance.html

CONCLUSION
In this study, a total of 71.15 Gb of high-quality sequence data were generated by high-throughput sequencing of the SLAF libraries, and a high-resolution genetic map with 9,378 SLAF markers was developed. The map comprised 13 LGs, equal to the haploid chromosome number of sesame, and spanned a total genetic distance of 1,974.23 cM; the mean LG length was 151.86 cM, with an average genetic distance of 0.22 cM between adjacent markers. Based on the map, genes for basal branching habit (SiBH) and flowers per leaf axil (SiFA) were mapped to LG5 and LG11, respectively. The results will not only provide a platform for QTL/gene fine mapping, map-based gene cloning, and molecular breeding in sesame, but also provide a reference linkage map to help anchor sequence scaffolds of the physical map and so improve the genome assembly.

AUTHOR CONTRIBUTIONS
HM and ZD constructed the BC1 population. HM and YL performed data analysis and linkage mapping. HM wrote and YL revised the manuscript. KW, CC, and XJ performed some of the field work and took part in DNA extraction. HZ provided valuable research ideas. YZ designed and supervised the study. All authors read and approved the final manuscript.

ACKNOWLEDGMENTS
This study was supported by the China Agriculture Research System (CARS-15) and the Independent Innovation Project (201412011, 2017ZC67) from Henan Academy of Agricultural Sciences.