ORIGINAL RESEARCH article

Front. Plant Sci., 31 October 2017

Sec. Plant Breeding

Volume 8 - 2017 | https://doi.org/10.3389/fpls.2017.01873

An RNA Sequencing Transcriptome Analysis of Grasspea (Lathyrus sativus L.) and Development of SSR and KASP Markers

  • 1. Key Laboratory of Crop Gene Resources and Germplasm Enhancement on Loess Plateau, Ministry of Agriculture, Shanxi Key Laboratory of Genetic Resources and Genetic Improvement of Minor Crops, Institute of Crop Germplasm Resources, Shanxi Academy of Agricultural Sciences, Taiyuan, China

  • 2. National Key Facility for Crop Gene Resources and Genetic Improvement, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing, China

  • 3. USDA-ARS Western Regional Plant Introduction Station, Pullman, WA, United States

  • 4. Department of Leguminous Crops Genetic Resources, N.I.Vavilov All-Russian Institute of Plant Genetic Resources, St. Petersburg, Russia

Abstract

Grasspea (Lathyrus sativus L., 2n = 14) has great agronomic potential because of its ability to survive under extreme conditions, such as drought and flood. However, this legume has been little investigated because of its sparse genomic resources and very slow breeding process. In this study, 570 million quality-filtered and trimmed cDNA sequence reads with a total length of over 82 billion bp were obtained using the Illumina NextSeq™ 500 platform. Approximately two million contigs and 142,053 transcripts were assembled from our RNA-Seq data, which resulted in 27,431 unigenes with an average length of 1,250 bp and a maximum length of 48,515 bp. The unigenes were of high quality. For example, the stay-green (SGR) gene of grasspea aligned with the SGR gene of pea with high similarity. From these unigenes, 3,204 EST-SSR primers were designed, 284 of which were randomly chosen for validation. Of these 284 primers, 87 (30.6%) produced polymorphic amplicons among 43 grasspea accessions selected from different geographical locations. Meanwhile, 146,406 SNPs were screened and 50 SNP loci were randomly chosen for kompetitive allele-specific PCR (KASP) validation. Over 80% (42) of these SNP loci were successfully converted to KASP markers. Comparison of the dendrograms constructed from the SSR and KASP markers showed that the two marker systems were partially consistent with each other.

Introduction

Grasspea (Lathyrus sativus L.) is a very promising cool-season annual legume crop in many parts of the world. This plant can tolerate abiotic stress, such as drought, salinity, and flood (Kumar et al., 2011; Jiang et al., 2013; Piwowarczyk et al., 2016; Zhou et al., 2016). Grasspea also plays an important role in many low-input farming systems (Patto et al., 2006). However, this plant has several disadvantageous traits, such as containing a neurotoxin [i.e., β-N-oxalyl-L-α, β-diaminopropionic acid (β-ODAP)], indeterminate and prostrate growth habit, delayed maturation, and pod shattering (Rybinski, 2003; Yan et al., 2006; Enneking, 2011). These drawbacks impede large-scale grasspea production.

These undesirable traits of grasspea can be improved with suitable breeding strategies. For example, donor germplasm with the desirable phenotypes can be used to form new breeding materials, and available genomic tools can be applied to expedite the breeding process. However, genomic resources for grasspea are still scarce compared with other food legume crops, because grasspea has a large genome of 8.2 Gb (Bennett and Leitch, 2012). A reference genome sequence for grasspea is unlikely to be available in the near future. Next-generation sequencing (NGS) technologies have been applied for transcriptome characterization, which is a cost-effective way to enrich genomic knowledge of grasspea. Almeida et al. (2014) used RNA-Seq to generate the first comprehensive transcriptome assemblies from control and Uromyces pisi-inoculated leaves of a susceptible and a partially rust-resistant grasspea genotype.

Research on NGS-based sequencing and marker development in grasspea is limited by low investment, and related reports are scarce. Yang et al. (2014) developed 50,144 non-redundant SSR primers, of which 288 were randomly selected for validation and diversity analysis among 23 L. sativus accessions and one Lathyrus cicera accession. Among the 288 markers, 74 (25.7%) were polymorphic, 70 (24.3%) were monomorphic, and 144 (50.0%) did not amplify any PCR product (Yang et al., 2014). Almeida et al. (2014) used RNA-Seq technology to develop 200 EST-SSR markers. Of these, 40 markers were tested: 25 (62.5%) were polymorphic between two accessions, 6 (15.0%) were monomorphic, 5 (12.5%) produced a very complex pattern and the remaining 4 (10.0%) gave no PCR product. Furthermore, they identified 2,634 contigs containing SNPs (Almeida et al., 2014).

Kompetitive allele specific PCR (KASP) genotyping assays are based on competitive allele-specific PCR and enable bi-allelic scoring of SNPs and insertions and deletions at specific loci. This flexible and cost-effective genotyping platform was developed by LGC Limited (Fleury and Whitford, 2014; Michael, 2014; Semagn et al., 2014). KASP assays have been used in maize (Mammadov et al., 2012), wheat (Neelam et al., 2013) and peanut (Khera et al., 2013). To our knowledge, the development of KASP markers for grasspea has not been reported yet.

In this paper, we chose two different grasspea accessions and used RNA-Seq to sequence RNA from a mixture of root, stem and leaf tissue collected at the seedling stage, in order to provide reference transcriptome information for grasspea. At the same time, we developed SSR and SNP markers, which will be useful for molecular plant breeding in the future.

Materials and methods

Plant materials and RNA isolation

Two grasspea (L. sativus) accessions, one from Africa (RQ23) and one from Europe (RQ36), were used. Each accession was sampled in triplicate, and the replicates were labeled RQ23-1, RQ23-2, RQ23-3, RQ36-1, RQ36-2, and RQ36-3 for RNA-Seq on an Illumina NextSeq™ 500.

A set of 43 grasspea (L. sativus) accessions was used in the SSR and KASP marker tests. These germplasm resources originated from 11 geographical regions as follows: 5 accessions from Eastern Asia, 3 from Central Asia, 5 from Southern Asia, 1 from Western Asia, 1 from Eastern Europe, 4 from Central Europe, 1 from Northern Europe, 3 from Western Europe, 14 from Southern Europe, 4 from Eastern Africa and 2 from Northern Africa.

The seed samples were obtained from the Institute of Crop Germplasm Resources, Shanxi Academy of Agricultural Sciences, Taiyuan, China. Detailed information is given in Table S4.

RNA-Seq library preparation and Illumina sequencing

RNA was extracted from each sample, a mixture of root, stem and leaf tissue at the seedling stage (3 weeks after sowing), using the RNA prepPure Plant Kit (Tiangen, Beijing, China) according to the manufacturer's instructions. Oligo-dT-labeled magnetic beads (Illumina Inc., San Diego, USA) were used to bind the poly(A) tails and thereby purify the mRNA. The mRNA was then mixed with fragmentation buffer to obtain short RNA fragments of 200–300 bp. These fragments were used to synthesize first-strand cDNA with random primers, and this cDNA was converted into double-stranded cDNA using RNase H and DNA polymerase I. Fragments of the desired length (200–300 bp) were purified with the QIAquick PCR Extraction Kit (Qiagen, Valencia, CA, USA). The protruding termini of the DNA fragments were end-repaired by the combined action of 3′-5′ exonuclease and polymerase. The end-repaired DNA fragments were ligated to sequencing adapters through A and T complementary base pairing. AMPure XP beads (Beckman Coulter, Shanghai, China) were then used to remove unsuitable fragments, and the sequencing library was constructed by PCR. The multiplexed cDNA libraries were tested using PicoGreen (Quantifluor™-ST fluorometer E6090, Promega, CA, USA) and fluorospectrophotometry (Quant-iT PicoGreen dsDNA Assay Kit; Invitrogen, P7589) and quantified with an Agilent 2100 Bioanalyzer (Agilent High Sensitivity DNA Kit, Agilent, 5067–4626). The synthesized cDNA libraries were normalized to 10 nM. Finally, the sequencing library was serially diluted and quantified to 4–5 pM and sequenced on the Illumina NextSeq™ 500 system. The raw data were deposited in the NCBI Sequence Read Archive (SRA) under accession SRP092875.

Data filtering and de novo assembly

After Illumina paired-end sequencing, the raw reads were filtered by removing adapter sequences, reads containing more than 10% unknown bases, and reads with a low quality score (Q < 20). Trinity r20140717 (https://github.com/trinityrnaseq/trinityrnaseq/wiki) was used to assemble the high-quality reads into contigs and transcripts with a k-mer size of 25. Data redundancy was reduced by clustering the transcripts through BLAST searches against the nr protein database with a cut-off e-value of 1e−5. The longest sequence in each cluster was then retained as the unigene.
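The read-level part of this filter can be sketched as follows. This is a minimal illustration, not the pipeline actually used: it assumes Phred+33 quality strings and interprets "low quality score (Q < 20)" as a mean read quality below 20, which the text does not specify. Adapter removal is not modeled, and the function names are hypothetical.

```python
def mean_phred(quality, offset=33):
    """Mean Phred score of a FASTQ quality string (Phred+33 encoding assumed)."""
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def keep_read(seq, quality, max_n_fraction=0.10, min_mean_q=20):
    """Return True if a read passes both filters described in the text.

    Assumption: 'Q < 20' is applied to the mean quality of the read.
    """
    if not seq or len(seq) != len(quality):
        return False
    n_fraction = seq.upper().count("N") / len(seq)
    if n_fraction > max_n_fraction:
        return False  # more than 10% unknown bases
    return mean_phred(quality) >= min_mean_q


# Example: a clean read, a read with 20% N, and a low-quality read
reads = [
    ("ACGTACGTAC", "I" * 10),  # 'I' encodes Q40
    ("ACGTNNACGT", "I" * 10),  # 2 of 10 bases are N
    ("ACGTACGTAC", "#" * 10),  # '#' encodes Q2
]
kept = [seq for seq, qual in reads if keep_read(seq, qual)]
```

Only the first read in the example survives; the other two fail the unknown-base and quality thresholds, respectively.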

Unigene annotation and classification

The unigenes were aligned with BLASTX against five protein databases: NCBI non-redundant protein sequences (Nr; cut-off e-value of 1e−5), Gene Ontology (GO; using Blast2GO and map2slim software), Kyoto Encyclopedia of Genes and Genomes (KEGG; using the bi-directional best hit method), evolutionary genealogy of genes: Non-supervised Orthologous Groups (eggNOG; cut-off e-value of 1e−5), and Swiss-Prot (cut-off e-value of 1e−5).

Aligning an annotated grasspea gene with its homologue in pea

The pea SGR mRNA sequences were retrieved from NCBI using the accession numbers AB303331 and AB303332. The grasspea unigenes were converted into a BLAST database using CLC Genomics Workbench 9_0_1 software (CLC Inc., Aarhus, Denmark), and AB303331 and AB303332 were then blasted against this database with default parameters. The grasspea unigene with the most significant alignment score was c39901_g1_i1, annotated as a senescence-inducible chloroplast stay-green (SGR) protein; it was therefore taken as the SGR homolog in grasspea. Then c39901_g1_i1, AB303331 and AB303332 were translated to protein using the ORF finder software on the NCBI website (https://www.ncbi.nlm.nih.gov/orffinder/). The longest translated protein was selected for further analysis. Finally, multiple alignment was performed with the ClustalW multiple alignment function of BioEdit version 7.3.5 (12/22/2013) (Hall, 1999).

SSR search and primer design

High-throughput SSR search was performed with the MIcroSAtellite identification tool (MISA, version 1.0, http://pgrc.ipk-gatersleben.de/misa/misa.html). The minimum numbers of repeats were set as follows: mono-10, di-6, tri-5, tetra-5, penta-5, and hexa-5. The maximum interruption allowed between two different SSRs in a compound SSR was 100 bp. Primer pairs targeting the conserved sequences flanking the selected microsatellite loci were designed using Primer 3.0 (https://sourceforge.net/projects/primer3/) (Rozen and Skaletsky, 2000).
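The repeat thresholds above can be illustrated with a small regular-expression search. This is a simplified sketch of perfect-repeat detection under the stated minimum repeat numbers, not a reimplementation of MISA: compound SSRs (the 100 bp interruption rule) are not handled, and the function name `find_ssrs` is hypothetical.

```python
import re

# Minimum number of repeats per motif length, as given in the text
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq):
    """Return (start, end, motif, n_repeats) for perfect SSRs in seq.

    Coordinates are 0-based, end-exclusive.
    """
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # a motif of unit_len bases followed by at least (min_rep - 1) copies
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT")
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            n_repeats = (match.end() - match.start()) // unit_len
            hits.append((match.start(), match.end(), motif, n_repeats))
    return sorted(hits)


# Example: one (AT)6 and one (A)10 microsatellite embedded in flanking sequence
example = "GGC" + "AT" * 6 + "CCG" + "A" * 10 + "T"
ssrs = find_ssrs(example)
```

In the example, `ssrs` holds the dinucleotide repeat at positions 3 to 15 and the mononucleotide repeat at positions 18 to 28.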

DNA extraction and PCR amplification

Genomic DNA was extracted from fresh leaves of seedlings (3 weeks after sowing) of the 43 accessions using the Rapid Plant Genomic DNA Isolation Kit (B518231-0100, Sangon Biotech, Shanghai, China), strictly following the manufacturer's instructions. DNA quality was checked on a BioTek Synergy H1, and the DNA was diluted to 30 ng/μL for PCR-based marker validation. Each 10 μL amplification reaction contained 0.1 μL of Taq DNA polymerase (5 U/μL, Aidlab, Beijing, China), 2 μL of primers (12 ng/μL, Personalbio, Shanghai, China), 1 μL of 10× buffer (Aidlab, Beijing, China), 0.25 μL of dNTPs (10 mM, Sangon Biotech, Shanghai, China), 5.15 μL of ultrapure water (Millipore Direct-Q3), and 1.5 μL of genomic DNA (30 ng/μL). Microsatellite loci were amplified on a C1000 Thermal Cycler (Bio-Rad, USA) under the following cycling conditions: initial denaturation at 95°C for 5 min; 35 cycles of 95°C for 30 s, 52°C for 45 s, and 72°C for 45 s; and a final extension at 72°C for 10 min. The PCR products were separated by 6.0–8.0% non-denaturing polyacrylamide gel electrophoresis and visualized by silver nitrate staining.

SSR marker polymorphic validation

Genetic diversity parameters were calculated from the SSR marker screening data using PowerMarker (version 3.25) (http://statgen.ncsu.edu/powermarker/). For the polymorphic SSR markers, these parameters included the major allele frequency, number of alleles, gene diversity (GD), heterozygosity, and polymorphic information content (PIC).
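For readers unfamiliar with these statistics, the sketch below computes them for a single marker from diploid genotype calls. It uses the common textbook definitions (gene diversity as 1 − Σp², PIC following Botstein et al.) as an assumption; PowerMarker's own estimators may differ in detail, and the function name and input format are hypothetical.

```python
from collections import Counter
from itertools import combinations


def marker_statistics(genotypes):
    """Summarize one marker from a list of diploid genotypes, e.g. [("A", "B"), ...].

    Assumed definitions:
      gene diversity = 1 - sum(p_i^2)
      heterozygosity = fraction of heterozygous individuals
      PIC            = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2
    """
    allele_counts = Counter(allele for pair in genotypes for allele in pair)
    total = sum(allele_counts.values())
    freqs = [count / total for count in allele_counts.values()]

    sum_sq = sum(p * p for p in freqs)
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    n_het = sum(1 for a, b in genotypes if a != b)

    return {
        "major_allele_frequency": max(freqs),
        "allele_number": len(freqs),
        "gene_diversity": 1 - sum_sq,
        "heterozygosity": n_het / len(genotypes),
        "pic": 1 - sum_sq - cross,
    }


# Example: a bi-allelic marker with equal allele frequencies
stats = marker_statistics([("A", "A"), ("A", "B"), ("B", "B"), ("A", "B")])
```

With two equally frequent alleles, gene diversity is 0.5 and PIC is 0.375, which is the upper bound for a bi-allelic marker under these definitions.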

SNP search

Bowtie2 (version 2.2.4) was used to map the high-quality reads to the unigenes with default parameters. Samtools (version 1.1) (Li et al., 2009) was then used to generate BAM files. VarScan (version 2.3.7) (Koboldt et al., 2012) was used to call SNPs with the following parameters: minimum coverage, 8; minimum supporting reads, 2; minimum variant frequency, 0.2; minimum average quality, 15; p-value, 0.01.
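To make the count-based thresholds concrete, the following sketch applies the coverage, supporting-read and variant-frequency cut-offs to a single candidate site. It is an illustration only and does not reproduce VarScan: base-quality filtering and the p-value test are omitted, coverage is assumed to be the sum of reference and variant reads, and the function name is hypothetical.

```python
def passes_count_filters(ref_reads, var_reads,
                         min_coverage=8, min_var_reads=2, min_var_freq=0.2):
    """Apply the three read-count thresholds from the text to one site.

    Assumption: coverage = ref_reads + var_reads (reads already quality-filtered).
    The average-quality and p-value criteria are not modeled here.
    """
    coverage = ref_reads + var_reads
    if coverage < min_coverage:
        return False
    if var_reads < min_var_reads:
        return False
    return var_reads / coverage >= min_var_freq


# Example sites as (reference reads, variant reads)
sites = {"site_a": (6, 2), "site_b": (5, 2), "site_c": (18, 2), "site_d": (9, 1)}
retained = sorted(name for name, (ref, var) in sites.items()
                  if passes_count_filters(ref, var))
```

Here only `site_a` is retained: `site_b` lacks coverage, `site_c` falls below the variant frequency of 0.2, and `site_d` has a single supporting read.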

KASP primer design and validation

KASP primers were designed according to the standard KASP guidelines. The allele-specific primers carried the FAM (5′-GAAGGTGACCAAGTTCATGCT-3′) and HEX (5′-GAAGGTCGGAGTCAACGGATT-3′) tails, with the targeted SNP at the 3′ end. Each sample was genotyped in 1,536-well plates with 1 μL of reaction mix containing dry DNA, 0.5 μL of 2× Master mix, 0.014 μL of Primer mix, and 0.486 μL of ddH2O. All reagents were briefly vortex-mixed prior to use. The KASP thermal cycling program was as follows: 94°C for 15 min; 10 touchdown cycles of 94°C for 20 s and 61–55°C for 60 s (decreasing by 0.6°C per cycle); and 26 cycles of 94°C for 20 s and 55°C for 60 s. The FAM and HEX fluorophores were used to distinguish genotypes, and Snpviewer2 was used to view the KASP results. Major allele frequency, number of alleles, GD, heterozygosity and PIC were calculated for the polymorphic SNP markers in the same way as for the SSR markers.
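The primer layout and the touchdown step can be sketched in a few lines. This is a hypothetical illustration of the idea (tail + locus-specific sequence + allele base at the 3′ end), not the design software actually used; the core length of 20 nt and both function names are assumptions, and real designs also consider melting temperature and the common reverse primer.

```python
# Tail sequences as given in the text; the hyphen inside the printed HEX tail
# is treated here as a typesetting artifact (assumption).
FAM_TAIL = "GAAGGTGACCAAGTTCATGCT"
HEX_TAIL = "GAAGGTCGGAGTCAACGGATT"


def allele_specific_primers(upstream, allele_fam, allele_hex, core_length=20):
    """Build the two allele-specific forward primers for one SNP.

    upstream: sequence immediately 5' of the SNP (same strand).
    Each primer = tail + last (core_length - 1) upstream bases + allele base,
    so the SNP sits at the 3' end. core_length is an illustrative choice.
    """
    core = upstream[-(core_length - 1):].upper()
    return (FAM_TAIL + core + allele_fam.upper(),
            HEX_TAIL + core + allele_hex.upper())


def touchdown_annealing(start=61.0, step=0.6, cycles=10):
    """Annealing temperature (°C) for each of the touchdown cycles."""
    return [round(start - step * i, 1) for i in range(cycles)]


fam_primer, hex_primer = allele_specific_primers("ACGT" * 6, "A", "G")
schedule = touchdown_annealing()
```

Under these settings the annealing temperature falls from 61.0°C in the first touchdown cycle to 55.6°C in the tenth, after which the program continues at 55°C.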

Construction of phylogenetic dendrograms

Phylogenetic dendrograms were constructed from the screening data of the SSR and SNP markers. PowerMarker (version 3.25) was used to calculate allele frequencies and genetic distances (Nei and Roychoudhury, 1974) and to build both the original tree and the bootstrap consensus tree by the unweighted pair-group method with arithmetic means (UPGMA), with 1,000 bootstrap replicates. The dendrograms were then drawn with MEGA (version 5.1) (http://www.megasoftware.net/download_form).
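The clustering step itself can be sketched as follows. This is a minimal UPGMA implementation operating on a precomputed distance matrix; it does not compute Nei's distance or perform bootstrapping, and the function name and tree representation (nested tuples of left subtree, right subtree and node height) are choices made for this illustration only.

```python
from itertools import combinations


def upgma(labels, matrix):
    """Cluster labels by UPGMA from a symmetric distance matrix.

    Returns a nested tuple (left, right, height); leaves are plain labels.
    """
    clusters = {i: (labels[i], 1) for i in range(len(labels))}  # id -> (tree, size)
    dist = {frozenset((i, j)): matrix[i][j]
            for i, j in combinations(range(len(labels)), 2)}
    next_id = len(labels)

    while len(clusters) > 1:
        # pick the closest pair of current clusters
        pair = min((frozenset(p) for p in combinations(clusters, 2)),
                   key=lambda p: dist[p])
        a, b = sorted(pair)
        (tree_a, n_a), (tree_b, n_b) = clusters.pop(a), clusters.pop(b)
        height = dist[pair] / 2
        # size-weighted average distance from the merged cluster to the others
        for c in clusters:
            dist[frozenset((next_id, c))] = (
                n_a * dist[frozenset((a, c))] + n_b * dist[frozenset((b, c))]
            ) / (n_a + n_b)
        clusters[next_id] = ((tree_a, tree_b, height), n_a + n_b)
        next_id += 1

    return next(iter(clusters.values()))[0]


# Example with three hypothetical accessions
tree = upgma(["X", "Y", "Z"], [[0, 2, 6],
                               [2, 0, 8],
                               [6, 8, 0]])
```

In the example, X and Y join first at height 1.0, and Z joins the pair at height 3.5, the average of its distances to X and Y divided by two.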

Results

Illumina sequencing and de novo assembly of the grasspea

A total of 111.8, 86.9, 107.7, 90.0, 84.3, and 99.1 million raw reads were generated by the Illumina NextSeq™ 500 system for RQ23-1, RQ23-2, RQ23-3, RQ36-1, RQ36-2, and RQ36-3, respectively. After removal of the adaptor and low quality reads, approximately 109.7, 85.2, 106.1, 88.1, 83.2, and 98.0 million clean reads remained for RQ23-1, RQ23-2, RQ23-3, RQ36-1, RQ36-2, and RQ36-3, respectively. The combined sequences of these clean reads were assembled into 142,053 transcripts and 27,431 unigenes. Table 1 shows that the N50 of the transcript was 1,294 bp and the average length was 846 bp. Meanwhile, the N50 of the unigenes was 1,781 bp and the average length was 1,250 bp.

Table 1

Parameter             Transcript     Unigene
Total Length (bp)     120,219,755    34,278,781
Sequence Number       142,053        27,431
Max. Length (bp)      48,514         48,514
Mean Length (bp)      846            1,250
N50 (bp)              1,294          1,781
N50 Sequence No.      28,623         6,486
N90 (bp)              352            637
N90 Sequence No.      97,051         18,215
GC%                   39.77          40.28

Characteristics of de novo assembly of the grasspea by Trinity software in this study.
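As a reminder of how the N50 and N90 rows in Table 1 are defined, the sketch below computes Nx and the corresponding sequence count from a list of sequence lengths. It is a generic illustration with made-up lengths, not the script used to produce Table 1; the function name `nx` is hypothetical.

```python
def nx(lengths, fraction=0.5):
    """Return (Nx length, number of sequences needed) for an assembly.

    Sequences are sorted from longest to shortest; Nx is the length of the
    sequence at which the running total first reaches `fraction` of all bases.
    """
    total = sum(lengths)
    running = 0
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        running += length
        if running >= fraction * total:
            return length, count
    raise ValueError("lengths must be a non-empty list of positive integers")


# Example with made-up sequence lengths (total 32 bp)
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
n50, n50_count = nx(toy_lengths, 0.5)
n90, n90_count = nx(toy_lengths, 0.9)
```

For the toy data, half of the 32 bases are contained in the two longest sequences, so N50 is 8 with a sequence count of 2, while N90 is 2 with a sequence count of 7.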

Gene annotation and functional classification

All 27,431 unigenes (100%) showed significant BLASTX similarity to NCBI non-redundant (Nr) protein sequences, and 19,867 (72.4%) showed significant similarity to Swiss-Prot sequences (Figure 1). The transcriptome of grasspea was functionally annotated using BLAST2GO with default parameters (Conesa et al., 2005; Götz et al., 2008). The map2slim script mapped the gene association file (containing annotations to the full GO) to the terms in the GO slim. Figure 2 shows that metabolic process was the most frequent category in biological process, cell was the most frequent category in cellular component, and binding was the most frequent category in molecular function.

Figure 1

Figure 2

Meanwhile, eggNOG annotation was performed by blasting against the eggNOG (version 4.0) database, and a total of 25,822 unigenes were annotated. Figure 3 shows that unknown function and general function prediction were the most frequent categories, whereas undetermined and cell motility were the least frequent. KEGG pathways were also analyzed. Figure 4 shows that carbohydrate metabolism, transcription, signal transduction, cell growth and death, endocrine system and infectious diseases were the most frequent categories in metabolism, genetic information processing, environmental information processing, cellular processes, organismal systems and human diseases, respectively. Interestingly, c39901_g1_i1, annotated as a senescence-inducible chloroplast SGR protein, was identified as the SGR gene in grasspea. Sato et al. (2007) found that pea SGR is Mendel's green cotyledon gene (I/i), encoding a positive regulator of the chlorophyll-degrading pathway in pea. Figure 5 illustrates that the SGR gene of grasspea aligned with the SGR gene of pea with high similarity (Sato et al., 2007; Hradilová et al., 2017).

Figure 3

Figure 4

Figure 5

Polymorphic validation of EST-SSR markers

A total of 3,204 EST-SSR primers were designed (Table S2) and 284 (Table S3) were randomly selected for validation. The EST-SSR markers were validated with the 43 grasspea accessions mentioned previously. The results showed that 87 markers were polymorphic and 88 were monomorphic, accounting for 30.6 and 31.0% of the 284 markers, respectively.

Table 2 shows that the number of alleles ranged from 2 to 8 with a mean value of 3.6. The PIC values varied from 0.0848 to 0.7425 with a mean value of 0.4158. These results suggest that these EST-SSRs are informative markers that will be useful for marker-assisted breeding in the future.

Table 2

No.   Marker   Allele no.   Gene diversity   Heterozygosity   PIC
1     9        6            0.7633           0.0952           0.7312
2     10       4            0.3818           0.0000           0.3473
3     11       7            0.7434           0.1628           0.7043
4     13       6            0.5488           0.0000           0.5167
5     17       2            0.1687           0.0000           0.1545
6     18       3            0.5581           0.1628           0.4748
7     19       5            0.3883           0.0000           0.3614
8     22       3            0.5105           0.3256           0.4195
9     24       5            0.3772           0.0233           0.3579
10    27       4            0.5054           0.0233           0.4260
11    32       3            0.5365           0.0000           0.4342
12    34       3            0.5498           0.1395           0.4498
13    37       2            0.1874           0.0698           0.1698
14    38       3            0.3743           0.0000           0.3308
15    41       2            0.4543           0.0000           0.3511
16    42       3            0.5744           0.0465           0.5113
17    44       5            0.5027           0.0698           0.4513
18    46       3            0.6224           0.0000           0.5436
19    48       3            0.3350           0.1163           0.3076
20    49       5            0.4075           0.1395           0.3831
21    50       4            0.3256           0.0000           0.3097
22    51       6            0.6666           0.1163           0.6378
23    58       5            0.6533           0.0238           0.5923
24    61       3            0.3916           0.0000           0.3310
25    62       4            0.3748           0.0000           0.3350
26    63       2            0.4867           0.0000           0.3683
27    64       3            0.5430           0.0930           0.4741
28    65       3            0.5127           0.0000           0.4030
29    66       3            0.5525           0.2326           0.4763
30    67       3            0.5222           0.0698           0.4080
31    70       3            0.5741           0.0233           0.5109
32    72       3            0.6088           0.0000           0.5390
33    73       4            0.4770           0.0930           0.4147
34    77       2            0.0887           0.0000           0.0848
35    86       3            0.5527           0.1860           0.4590
36    97       3            0.4248           0.0465           0.3522
37    100      2            0.4932           0.0000           0.3716
38    104      5            0.6721           0.3095           0.6141
39    110      4            0.5023           0.0000           0.4320
40    116      4            0.5365           0.0465           0.4904
41    120      2            0.4781           0.0000           0.3638
42    131      5            0.6844           0.1395           0.6225
43    133      3            0.4429           0.0244           0.4013
44    134      5            0.3916           0.0465           0.3685
45    143      6            0.7380           0.1628           0.6959
46    144      5            0.5628           0.1351           0.4673
47    147      6            0.7542           0.2093           0.7191
48    148      4            0.6552           0.0930           0.6076
49    155      3            0.3974           0.0000           0.3616
50    158      5            0.7785           0.1628           0.7425
51    160      4            0.3167           0.0233           0.2890
52    162      4            0.6763           0.3256           0.6161
53    165      3            0.5560           0.1860           0.4956
54    167      5            0.2718           0.0233           0.2606
55    168      3            0.3104           0.0000           0.2746
56    174      4            0.5979           0.0698           0.5523
57    179      3            0.5322           0.0000           0.4721
58    180      2            0.3029           0.0000           0.2570
59    183      3            0.5162           0.1628           0.4228
60    193      4            0.3469           0.0465           0.3120
61    195      4            0.2117           0.0465           0.2003
62    200      2            0.4770           0.2143           0.3633
63    202      3            0.3396           0.0000           0.2956
64    203      4            0.6414           0.0465           0.5776
65    205      4            0.4010           0.0465           0.3509
66    206      3            0.2258           0.1163           0.2050
67    210      4            0.6414           0.2558           0.5788
68    214      4            0.3213           0.0465           0.2997
69    221      4            0.5298           0.0238           0.4407
70    226      2            0.4082           0.0000           0.3249
71    228      2            0.3442           0.1163           0.2850
72    230      4            0.4227           0.1860           0.3963
73    231      2            0.1298           0.0000           0.1214
74    235      2            0.4082           0.0000           0.3249
75    240      3            0.5790           0.2326           0.5024
76    244      3            0.2120           0.0000           0.2010
77    250      2            0.3442           0.2093           0.2850
78    253      5            0.5484           0.0000           0.4607
79    257      5            0.6266           0.2326           0.5742
80    261      2            0.3569           0.0930           0.2932
81    269      3            0.5000           0.0000           0.4275
82    270      2            0.1298           0.0000           0.1214
83    272      4            0.2565           0.0952           0.2436
84    273      2            0.3442           0.0698           0.2850
85    279      4            0.6617           0.2326           0.6053
86    281      2            0.4543           0.1395           0.3511
87    283      8            0.7593           0.3333           0.7300

Results of 87 effective primers screening in 43 accessions of Lathyrus sativus L.

Polymorphic validation of KASP markers

A total of 146,406 SNPs (Table S4) were detected in this study and 50 SNP loci (Table S5) were randomly selected for KASP validation. Of these, 42 SNP loci were successfully converted to KASP markers. Two of these loci were monomorphic and the others were polymorphic among the 43 accessions. Table 3 shows that the PIC values ranged from 0 to 0.3750 with an average of 0.2457. In comparison, the KASP markers were less informative than the EST-SSR markers, as indicated by their lower PIC values. Because the conversion rate from SNPs to KASP markers was high, SNP markers associated with desirable traits should be easy to convert to KASP markers for marker-assisted selection in the future.

Table 3

No.   Marker   Allele No.   Gene Diversity   Heterozygosity   PIC
1     c24137   2.0000       0.4730           0.1163           0.3611
2     c29065   2.0000       0.2726           0.0930           0.2354
3     c29470   2.0000       0.4640           0.1951           0.3564
4     c31876   2.0000       0.4997           0.0714           0.3749
5     c31909   2.0000       0.0454           0.0000           0.0444
6     c34057   2.0000       0.4673           0.1860           0.3581
7     c34112   2.0000       0.2401           0.1395           0.2113
8     c34138   2.0000       0.1095           0.0698           0.1035
9     c34320   2.0000       0.2566           0.0698           0.2237
10    c35446   2.0000       0.0887           0.0000           0.0848
11    c36697   2.0000       0.3691           0.2093           0.3010
12    c36972   2.0000       0.4976           0.0930           0.3738
13    c36982   2.0000       0.4024           0.0465           0.3214
14    c37578   1.0000       0.0000           0.0000           0.0000
15    c40344   2.0000       0.0887           0.0000           0.0848
16    c40733   2.0000       0.1095           0.0698           0.1035
17    c40873   2.0000       0.4308           0.1628           0.3380
18    c41137   2.0000       0.4827           0.0698           0.3662
19    c41578   2.0000       0.4308           0.2558           0.3380
20    c41707   2.0000       0.4957           0.3023           0.3728
21    c41761   2.0000       0.1298           0.0465           0.1214
22    c41833   1.0000       0.0000           0.0000           0.0000
23    c41924   2.0000       0.4673           0.1395           0.3581
24    c41959   2.0000       0.2566           0.0698           0.2237
25    c42781   2.0000       0.1095           0.0698           0.1035
26    c43192   2.0000       0.0454           0.0000           0.0444
27    c43598   2.0000       0.5000           0.2093           0.3750
28    c43850   2.0000       0.4976           0.2791           0.3738
29    c43880   2.0000       0.4730           0.0698           0.3611
30    c44108   2.0000       0.3918           0.1628           0.3151
31    c44112   2.0000       0.4673           0.1395           0.3581
32    c44175   2.0000       0.4470           0.2093           0.3471
33    c44559   2.0000       0.3569           0.1395           0.2932
34    c45088   2.0000       0.4997           0.1395           0.3749
35    c45396   2.0000       0.4611           0.2093           0.3548
36    c45432   2.0000       0.1298           0.0465           0.1214
37    c45786   2.0000       0.0230           0.0233           0.0227
38    c46069   2.0000       0.2055           0.0930           0.1844
39    c46261   2.0000       0.0673           0.0233           0.0651
40    c47018   2.0000       0.3569           0.0000           0.2932
41    c47065   2.0000       0.3691           0.0698           0.3010
42    c47339   2.0000       0.4976           0.6977           0.3738

Results of 42 KASP primers screening in 43 accessions of Lathyrus sativus L.

Comparison of dendrograms according to SSR and KASP markers

Two types of UPGMA dendrograms, an original tree and a bootstrap consensus tree, were constructed from the genotype data of 87 SSR and 40 KASP markers for the 43 accessions. Genetic distances were calculated using Nei's genetic distance coefficient (Nei and Roychoudhury, 1974), with 1,000 bootstrap replicates. The 11 subgroups defined by geographical origin were clustered into five groups as follows: (1) Southern Europe, Central Europe, Southern Asia, Eastern Africa, Central Asia, Eastern Asia and Western Europe; (2) Northern Europe; (3) Eastern Europe; (4) Western Asia; and (5) Northern Africa (Table S1). Although minor differences exist between the two dendrograms, the relationships among subgroups are similar. The results suggest that these SSR and KASP markers are useful for assessing the genetic diversity of grasspea genetic resources (Figures 6A,B). In the bootstrap consensus trees, only bootstrap values > 30% are shown. Based on the SSR markers, the accessions from Eastern Europe, Northern Europe, Western Asia and Northern Africa formed one group that was supported by bootstrap values between 70 and 95%. The other accessions were weakly supported, with bootstrap values < 50%. Based on the KASP markers, accessions from Eastern Europe, Western Asia and Northern Europe similarly fell into one group, but those from Northern Africa did not, so the two marker systems were partially consistent (Figures 7A,B).

Figure 6

Figure 7

Discussion

Grasspea is an orphan crop with great potential

Grasspea is an under-researched legume crop with a large genome. As a minor crop, this legume is very important in arid and semi-arid regions, such as the Mediterranean, the Middle East and the South Asian subcontinent, in particular Italy, Spain, Egypt, Greece, Turkey, Ethiopia, Syria, India and Bangladesh (Patto et al., 2006; Yan et al., 2006; Dixit et al., 2016). Several research groups have paid attention to grasspea and its wild relative (L. cicera L.) because of the high resistance of these plants to both abiotic and biotic stresses, such as drought, flooding and saline-alkali conditions, and certain diseases and pests (Wang et al., 2015). However, grasspea is difficult to adopt for large-scale agricultural production worldwide because of its large genome (8.2 Gb), variable outcrossing rate (2–30%) (Rahman et al., 1995; Chowdry and Slinkard, 1997; Hillocks and Maruthi, 2012) and the presence of β-ODAP in its seed. With the help of NGS, genes related to complex traits are easier to identify using RNA-Seq, RAD-Seq, ChIP-Seq and GBS technologies (Singh and Singh, 2015). In this study, we applied RNA-Seq to two different accessions, one from Africa and the other from Europe, to assemble a gene reference for grasspea. The results will be useful for gene mining and molecular breeding to improve grasspea.

SSR markers are still effective and useful

SSR markers are co-dominant, abundantly polymorphic and ubiquitous in many eukaryotic species (Zietkiewicz et al., 1994), and they are highly repeatable and user-friendly. SSRs are powerful markers for germplasm evaluation and smart breeding. Numerous research groups use this molecular tool in many research fields, such as genetic diversity analysis, DNA fingerprinting, genetic linkage mapping, QTL mapping and allele mining (Zietkiewicz et al., 1994; Jun et al., 2008; Zhao et al., 2010; Soren et al., 2015). However, few SSR markers are available for grasspea, an orphan crop, compared with other crops (Sun et al., 2012; Lioi and Galasso, 2013; Almeida et al., 2014; Yang et al., 2014). In this study, we performed RNA-Seq on two different accessions, sequencing RNA from mixed root, stem and leaf tissues. We identified 5,916 SSR markers from the resulting sequence data, designed primer pairs and validated 284 of these markers. Our results showed that 87 (30.6%) of the SSRs were polymorphic and 88 (31.0%) were monomorphic. The remaining SSRs either produced no target bands or gave patterns too complex to score.

KASP markers are new and powerful

KASP is a new and powerful tool for SNP genotyping. Although many SNP genotyping methods are available, such as allele-specific PCR, TaqMan assays, molecular beacons and microarray-based SNP genotyping, they are very expensive (Singh and Singh, 2015). KASP is an SNP genotyping assay based on allele-specific PCR with two different forward primers and one reverse primer (Graves et al., 2016). This assay is not only accurate and highly efficient, but also inexpensive (Khera et al., 2013; Lister et al., 2013). In this study, we used KASP technology to validate 50 SNP primers among 43 grasspea genotypes. Of these, 40 SNPs were polymorphic, two were monomorphic, and eight failed detection. The mean PIC value was 0.2457, lower than that of the SSR markers, because SNP markers have only two alleles, whereas SSR markers can have more. Nevertheless, SNPs are very important and widely used because of their stability, dependability and suitability for high-throughput genotyping. They are highly efficient and accurate for gene discovery (Klepadlo et al., 2017).

Conclusion

RNA-Seq was performed on two different grasspea accessions with three replicates each. Sequencing depth was more than 12 Gb for each sample. Based on the de novo assembly of the sequencing data, 1,970,104 contigs, 142,053 transcripts and 27,431 unigenes were obtained. A total of 284 SSR markers were tested, of which 30.6% were polymorphic and 31.0% were monomorphic among 43 grasspea accessions collected worldwide. For SNP markers, 146,406 SNP loci were called, 50 SNP markers were tested on the 43 grasspea accessions and 42 SNP loci were successfully converted into KASP markers. The resulting transcriptome data on grasspea have been uploaded to the National Center for Biotechnology Information (NCBI) database. The newly discovered SSR and SNP markers will be useful for genetic improvement of grasspea through breeding.

Statements

Author contributions

XH, TY, JC, and XZ conceived and designed the experiment. XH, TY, YW, RL, and HZ performed the experiment. TY, XH, RL, YY, GR, and DW analyzed the RNA-Seq data. XH, TY, and RL wrote the manuscript. JH, MB, JC, and XZ revised the manuscript.

Funding

This work was financed by the funding of China Agriculture Research Systems (CARS-08), Cooperation Research on Collecting Techniques and Practice in Crop Genebank between China and United States of America (2014DFG31860), Program for Special Agricultural Technology Development of Shanxi Academy of Agricultural Sciences (YGG17055), National Infrastructure for Crop Germplasm Resources project from the Ministry of Science and Technology of China (NICGR2016), Program from Ministry of Agriculture of China (2016-X16), Seed Industry Development Project of Shanxi Academy of Agricultural Sciences (2016zyzx41), and also supported by the Agricultural Science and Technology Innovation Program (ASTIP) in CAAS.

Acknowledgments

We would like to sincerely thank Shanghai Personal Biotechnology Co., Ltd. and Beijing Vegetable Research Center, Beijing Academy of Agricultural and Forestry Sciences for the transcriptome sequencing and KASP testing service. We also thank Mrs. Jiali Xie, Miss Hongyan Guo, and Miss Xin Song for screening the SSR markers, as well as Mr. Sha Gong for explaining the sequencing data.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2017.01873/full#supplementary-material

    Abbreviations

  • β-ODAP

    β-N-oxalyl-L-α, β-diaminopropionic acid

  • BLAST

    basic local alignment search tool

  • eggNOG

    evolutionary genealogy of genes: non-supervised orthologous groups

  • GD

    gene diversity

  • GO

    gene ontology

  • KASP

    kompetitive allele-specific PCR

  • KEGG

    kyoto encyclopedia of genes and genomes

  • NCBI

    national center for biotechnology information

  • NGS

    next-generation sequencing

  • PIC

    polymorphic information content

  • ORF

    open reading frame

  • QTL

    quantitative trait loci

  • SGR

    stay-green

  • SRA

    sequence read archive

  • UPGMA

    unweighted pair-group method with arithmetic means.

References

  • 1

    Almeida, N. F., Leitao, S. T., Krezdorn, N., Rotter, B., Winter, P., Rubiales, D., et al. (2014). Allelic diversity in the transcriptomes of contrasting rust-infected genotypes of Lathyrus sativus, a lasting resource for smart breeding. BMC Plant Biol. 14:376. doi: 10.1186/s12870-014-0376-2

  • 2

    Bennett, M. D., and Leitch, I. J. (2012). Angiosperm DNA C-Values Database (release 6.0, Dec. 2012 ed). London.

  • 3

    ChowdryM. A.SlinkardA. E. (1997). Natural outcrossing in grasspea. J. Heredity88, 154–156. 10.1093/oxfordjournals.jhered.a023076

  • 4

    ConesaA.GötzS.GarcíagómezJ. M.TerolJ.TalónM.RoblesM. (2005). Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics21, 3674–3676. 10.1093/bioinformatics/bti610

  • 5

    DixitG. P.PariharA. K.BohraA.SinghN. P. (2016). Achievements and prospects of grass pea (Lathyrus sativus L.) improvement for sustainable food production. Crop J.4, 407–416. 10.1016/j.cj.2016.06.008

  • 6

    EnnekingD. (2011). The nutritive value of grasspea (Lathyrus sativus) and allied species, their toxicity to animals and the role of malnutrition in neurolathyrism. Food Chem. Toxicol.49, 694–709. 10.1016/j.fct.2010.11.029

  • 7

    FleuryD.WhitfordR. (2014). Crop Breeding Methods and Protocols.New York, NY: Humuna Press.

  • 8

    GötzS.GarcíagómezJ. M.TerolJ.WilliamsT. D.NagarajS. H.NuedaM. J.et al. (2008). High-throughput functional annotation and data mining with the Blast2GO suite. Nucleic Acids Res.36, 3420–3435. 10.1093/nar/gkn176

  • 9

    GravesH.RayburnA. L.GonzalezhernandezJ. L.NahG.KimD. S.LeeD. K. (2016). Validating DNA polymorphisms using KASP assay in prairie cordgrass (Spartina pectinata Link) populations in the U.S. Front. Plant Sci.6:1271. 10.3389/fpls.2015.01271

  • 10

    HallT. A. (1999). BioEdit: a user-friendly biological sequence alignment editor and analysis program for windows 95/98/NT. Nucl. Acids. Symp. Ser. 41, 95–98.

  • 11

    HillocksR. J.MaruthiM. N. (2012). Grass pea (Lathyrus sativus): Is there a case for further crop improvement?Euphytica186, 647–654. 10.1007/s10681-012-0702-4

  • 12

    HradilováI.TrnenýO.VálkováM.CechovaM.JanskáA.KhanA. W.et al. (2017). A combined comparative transcriptomic, metabolomic and anatomical analyses of two key domestication traits: pod dehiscence and seed dormancy in pea (Pisum sp.). Front. Plant Sci.8:542. 10.3389/fpls.2017.00542

  • 13

    JiangJ.SuM.ChenY.GaoN.JiaoC.SunZ.et al. (2013). Correlation of drought resistance in grass pea (Lathyrus sativus) with reactive oxygen species scavenging and osmotic adjustment. Biologia68, 231–240. 10.2478/s11756-013-0003-y

  • 14

    JunT. H.VanK.KimM. Y.LeeS. H.WalkerD. R. (2008). Association analysis using SSR markers to find QTL for seed protein content in soybean. Euphytica162, 179–191. 10.1007/s10681-007-9491-6

  • 15

    KheraP.UpadhyayaH. D.PandeyM. K.RoorkiwalM.SriswathiM.JanilaP.et al. (2013). Single nucleotide polymorphism–based genetic diversity in the reference set of peanut (Arachis spp.) by developing and applying cost-effective kompetitive allele specific polymerase chain reaction genotyping assays. Plant Genome6, 1–11. 10.3835/plantgenome2013.06.0019

  • 16

    KlepadloM.ChenP.ShiA.MasonR. E.KorthK. L.SrivastavaV. (2017). Single nucleotide polymorphism markers for rapid detection of the Rsv4 locus for soybean mosaic virus resistance in diverse germplasm. Mol. Breed.37:10. 10.1007/s11032-016-0595-3

  • 17

    KoboldtD. C.ZhangQ.LarsonD. E.DongS.MclellanM. D.LingL.et al. (2012). VarScan 2: somatic mutation and copy number alteration discovery in cancer by exome sequencing. Genome Res.22, 568. 10.1101/gr.129684.111

  • 18

    KumarS.BejigaG.AhmedS.NakkoulH.SarkerA. (2011). Genetic improvement of grass pea for low neurotoxin (β-ODAP) content. Food Chem. Toxicol.49, 589–600. 10.1016/j.fct.2010.06.051

  • 19

    LiH.HandsakerB.WysokerA.FennellT.RuanJ.HomerN.et al. (2009). The Sequence Alignment/Map format and SAMtools. Bioinformatics25, 2078–2079. 10.1093/bioinformatics/btp352

  • 20

    LioiL.GalassoI. (2013). Development of genomic simple sequence repeat markers from an enriched genomic library of grass pea (Lathyrus sativus L.). Plant Breed.132, 649–653. 10.1111/pbr.12093

  • 21

    ListerD. L.JonesH.JonesM. K.O'SullivanD. M.CockramJ. (2013). Analysis of DNA polymorphism in ancient barley herbarium material: validation of the KASP SNP genotyping platform. Taxon62, 779–789. 10.12705/624.9

  • 22

    MammadovJ.ChenW.MingusJ.ThompsonS.KumpatlaS. (2012). Development of versatile gene-based SNP assays in maize (Zea mays L.). Mol. Breed.29, 779–790. 10.1007/s11032-011-9589-3

  • 23

    MichaelJ. T. (2014). High-throughput SNP genotyping to accelerate crop improvement. Plant Breed. Biotech. 2, 195–212. 10.9787/PBB.2014.2.3.195

  • 24

    NeelamK.Brown-GuediraG.HuangL. (2013). Development and validation of a breeder-friendly KASPar marker for wheat leaf rust resistance locus Lr21. Mol. Breed.31, 233–237. 10.1007/s11032-012-9773-0

  • 25

    NeiM.RoychoudhuryA. K. (1974). Sampling variances of heterozygosity and genetic distance. Genetics76, 379–390.

  • 26

    PattoM. C. V.SkibaB.PangE. C. K.OchattS. J.LambeinF.RubialesD. (2006). Lathyrus improvement for resistance against biotic and abiotic stresses: from classical breeding to marker assisted selection. Euphytica147, 133–147. 10.1007/s10681-006-3607-2

  • 27

    PiwowarczykB.TokarzK.KaminskaI. (2016). Responses of grass pea seedlings to salinity stress in in vitro culture conditions. Plant Cell Tissue Organ Cul.124, 227–240. 10.1007/s11240-015-0887-z

  • 28

    RahmanM. M.KumarJ.RahmanM. A.AfzalM. A. (1995). Natural outcrossing in Lathyrus sativus L. Indian J. Genet.55, 204–207.

  • 29

    RozenS.SkaletskyH. (2000). Primer3 on the WWW for general users and for biologist programmers. Methods Mol. Biol.132, 365–386. 10.1385/1-59259-192-2:365

  • 30

    RybinskiW. (2003). Mutagenesis as a tool for improvement of traits in grasspea (Lathyrus sativus L.). Lathyrus Lathyrism Newsl.3, 27–31.

  • 31

    SatoY.MoritaR.NishimuraM.YamaguchiH.KusabaM. (2007). Mendel's green cotyledon gene encodes a positive regulator of the chlorophyll-degrading pathway. Proc. Natl. Acad. Sci. U.S.A.104, 14169–14174. 10.1073/pnas.0705521104

  • 32

    SemagnK.BabuR.HearneS.OlsenM. (2014). Single nucleotide polymorphism genotyping using Kompetitive Allele Specific PCR (KASP): overview of the technology and its application in crop improvement. Mol. Breed.33, 1–14. 10.1007/s11032-013-9917-x

  • 33

    SinghB. D.SinghA. K. (2015). Marker-Assisted Plant Breeding: Principles and Practices.New Delhi: Springer Press.

  • 34

    SorenK. R.YadavA.PandeyG.GangwarP.PariharA. K.BohraA.et al. (2015). EST-SSR analysis provides insights about genetic relatedness, population structure and gene flow in grass pea (Lathyrus sativus). Plant Breed.134, 338–344. 10.1111/pbr.12268

  • 35

    SunX. L.YangT.GuanJ. P.MaY.JiangJ. Y.CaoR.et al. (2012). Development of 161 novel EST-SSR markers from Lathyrus sativus (Fabaceae). Am. J. Bot.99, e379–e390. 10.3732/ajb.1100346

  • 36

    WangF.YangT.BurlyaevaM.LiL.JiangJ.FangL.et al. (2015). Genetic diversity of grasspea and its relative species revealed by SSR markers. PLoS ONE10:e0118542. 10.1371/journal.pone.0118542

  • 37

    YanZ. Y.SpencerP. S.LiZ. X.LiangY. M.WangY. F.WangC. Y.et al. (2006). Lathyrus sativus (grass pea) and its neurotoxin ODAP. Phytochemistry67, 107–121. 10.1016/j.phytochem.2005.10.022

  • 38

    YangT.JiangJ.BurlyaevaM.HuJ.CoyneC. J.KumarS.et al. (2014). Large-scale microsatellite development in grasspea (Lathyrus sativus L.) an orphan legume of the arid areas. BMC Plant Biol.14:65. 10.1186/1471-2229-14-65

  • 39

    ZhaoW.ChoG. T.MaK. H.ChungJ. W.GwagJ. G.ParkY. J. (2010). Development of an allele-mining set in rice using a heuristic algorithm and SSR genotype data with least redundancy for the post-genomic era. Mol. Breed.26, 639–651. 10.1007/s11032-010-9400-x

  • 40

    ZhouL.ChengW.HouH.PengR.HaiN.BianZ.et al. (2016). Antioxidative responses and morpho-anatomical alterations for coping with flood-induced hypoxic stress in Grass Pea (Lathyrus sativus L.) in comparison with Pea (Pisum sativum). J. Plant Growth Regul.35, 1–11. 10.1007/s00344-016-9572-7

  • 41

    ZietkiewiczE.RafalskiA.LabudaD. (1994). Genome fingerprinting by simple sequence repeat (SSR)-anchored polymerase chain reaction amplification. Genomics20, 176–183. 10.1006/geno.1994.1151

Keywords

RNA-Seq, Lathyrus sativus, grasspea, SSR, SNP, KASP, genetic diversity

Citation

Hao X, Yang T, Liu R, Hu J, Yao Y, Burlyaeva M, Wang Y, Ren G, Zhang H, Wang D, Chang J and Zong X (2017) An RNA Sequencing Transcriptome Analysis of Grasspea (Lathyrus sativus L.) and Development of SSR and KASP Markers. Front. Plant Sci. 8:1873. doi: 10.3389/fpls.2017.01873

Received

13 July 2017

Accepted

13 October 2017

Published

31 October 2017

Volume

8 - 2017

Edited by

Petr Smýkal, Palacký University Olomouc, Czechia

Reviewed by

Aamir W. Khan, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), India; Kevin E. McPhee, Montana State University, United States

*Correspondence: Jianwu Chang

This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science

†These authors have contributed equally to this work.

Disclaimer

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
