A Combined Comparative Transcriptomic, Metabolomic, and Anatomical Analyses of Two Key Domestication Traits: Pod Dehiscence and Seed Dormancy in Pea (Pisum sp.)

The origin of the agriculture was one of the turning points in human history, and a central part of this was the evolution of new plant forms, domesticated crops. Seed dispersal and germination are two key traits which have been selected to facilitate cultivation and harvesting of crops. The objective of this study was to analyze anatomical structure of seed coat and pod, identify metabolic compounds associated with water-impermeable seed coat and differentially expressed genes involved in pea seed dormancy and pod dehiscence. Comparative anatomical, metabolomics, and transcriptomic analyses were carried out on wild dormant, dehiscent Pisum elatius (JI64, VIR320) and cultivated, indehiscent Pisum sativum non-dormant (JI92, Cameor) and recombinant inbred lines (RILs). Considerable differences were found in texture of testa surface, length of macrosclereids, and seed coat thickness. Histochemical and biochemical analyses indicated genotype related variation in composition and heterogeneity of seed coat cell walls within macrosclereids. Liquid chromatography–electrospray ionization/mass spectrometry and Laser desorption/ionization–mass spectrometry of separated seed coats revealed significantly higher contents of proanthocyanidins (dimer and trimer of gallocatechin), quercetin, and myricetin rhamnosides and hydroxylated fatty acids in dormant compared to non-dormant genotypes. Bulk Segregant Analysis coupled to high throughput RNA sequencing resulted in identification of 770 and 148 differentially expressed genes between dormant and non-dormant seeds or dehiscent and indehiscent pods, respectively. The expression of 14 selected dormancy-related genes was studied by qRT-PCR. Of these, expression pattern of four genes: porin (MACE-S082), peroxisomal membrane PEX14-like protein (MACE-S108), 4-coumarate CoA ligase (MACE-S131), and UDP-glucosyl transferase (MACE-S139) was in agreement in all four genotypes with Massive analysis of cDNA Ends (MACE) data. In case of pod dehiscence, the analysis of two candidate genes (SHATTERING and SHATTERPROOF) and three out of 20 MACE identified genes (MACE-P004, MACE-P013, MACE-P015) showed down-expression in dorsal and ventral pod suture of indehiscent genotypes. Moreover, MACE-P015, the homolog of peptidoglycan-binding domain or proline-rich extensin-like protein mapped correctly to predicted Dpo1 locus on PsLGIII. This integrated analysis of the seed coat in wild and cultivated pea provides new insight as well as raises new questions associated with domestication and seed dormancy and pod dehiscence.

The origin of the agriculture was one of the turning points in human history, and a central part of this was the evolution of new plant forms, domesticated crops. Seed dispersal and germination are two key traits which have been selected to facilitate cultivation and harvesting of crops. The objective of this study was to analyze anatomical structure of seed coat and pod, identify metabolic compounds associated with water-impermeable seed coat and differentially expressed genes involved in pea seed dormancy and pod dehiscence. Comparative anatomical, metabolomics, and transcriptomic analyses were carried out on wild dormant, dehiscent Pisum elatius (JI64, VIR320) and cultivated, indehiscent Pisum sativum non-dormant (JI92, Cameor) and recombinant inbred lines (RILs). Considerable differences were found in texture of testa surface, length of macrosclereids, and seed coat thickness. Histochemical and biochemical analyses indicated genotype related variation in composition and heterogeneity of seed coat cell walls within macrosclereids. Liquid chromatography-electrospray ionization/mass spectrometry and Laser desorption/ionization-mass spectrometry of separated seed coats revealed significantly higher contents of proanthocyanidins (dimer and trimer of gallocatechin), quercetin, and myricetin rhamnosides and hydroxylated fatty acids in dormant compared to non-dormant genotypes. Bulk Segregant Analysis coupled to high throughput RNA sequencing resulted in identification of 770 and 148 differentially expressed genes between dormant and non-dormant seeds or dehiscent and indehiscent pods, respectively. The expression of 14 selected dormancy-related genes was studied by qRT-PCR. Of these, expression pattern of four genes: porin (MACE-S082), peroxisomal membrane PEX14-like protein (MACE-S108), 4-coumarate CoA ligase (MACE-S131), and UDP-glucosyl transferase (MACE-S139) was in agreement in all four genotypes with Massive analysis of cDNA Ends (MACE) data. In case of pod dehiscence, the analysis of two candidate genes (SHATTERING and SHATTERPROOF) and three out of 20 MACE identified genes (MACE-P004, MACE-P013, MACE-P015) showed down-expression in dorsal and ventral pod suture of indehiscent genotypes. Moreover, MACE-P015, the homolog of peptidoglycan-binding domain or proline-rich extensin-like protein mapped correctly to predicted Dpo1 locus on PsLGIII. This integrated analysis of the seed coat in wild and cultivated pea provides new insight as well as raises new questions associated with domestication and seed dormancy and pod dehiscence.

INTRODUCTION
The origin of the agriculture was one of key points in human history, and a central part of this was the evolution of new plant forms, domesticated crops (Meyer et al., 2012;Fuller et al., 2014). The transformation of wild plants into crop plants can be viewed as an accelerated evolution, representing adaptations to cultivation and human harvesting, accompanied by genetic changes (Lenser and Theißen, 2013;Olsen and Wendel, 2013;Shi and Lai, 2015). Common set of traits have been recorded for unrelated crops (Hammer, 1984;Zohary and Hopf, 2000;Lenser and Theißen, 2013). These include loss of germination inhibition and loss of natural seed dispersal (Fuller and Allaby, 2009). The identity of some responsible genes has been revealed (reviewed in Meyer and Purugganan, 2013) through association mapping and genome sequencing, for example in soybean (Zhou et al., 2015), chickpea Kujur et al., 2015), and common bean (Schmutz et al., 2014).
Members of the Fabaceae family have been domesticated in parallel with cereals (Smartt, 1990;Zohary and Hopf, 2000) or possibly even earlier (Kislev and Bar-Yosef, 1988) resulting in largest number of domesticates per plant family (Smýkal et al., 2015). Despite of crucial position of legumes, as protein crops, in human diet as well as crop rotation systems (Foyer et al., 2016), comparably little is known on their domestication. Pea (Pisum sativum L.) is one of the world's oldest domesticated crops and is still globally important grain legume crop (Smýkal et al., 2012(Smýkal et al., , 2015. Experimental cultivation of wild peas have demonstrated that both seed dormancy and pod dehiscence cause poor crop establishment via reduced germination as well as dramatic yield losses via seed shattering (Abbo et al., 2011). The loss of fruit shattering has been under selection in the most seed crops, to facilitate seed harvest (Fuller and Allaby, 2009;Purugganan and Fuller, 2009), while in wild plants, shattering is a fundamental trait to assure seed dispersal (Bennett et al., 2011). Orthologous genes and functions were found to be conserved for seed shattering mechanisms between mono and dicotyledonous plants (Konishi et al., 2006). Recently, two genes have been identified to be involved in pod dehiscence in soybean. One of them is the dirigent-like protein (Pdh1) promoting pod dehiscence by increasing the torsion of dried pod walls, which serves as a driving force for pod dehiscence under low humidity (Funatsuki et al., 2014). The functional gene Pdh1 was highly expressed in the lignin-rich inner sclerenchyma of pod walls. Yet, another NAC family gene SHATTERING1-5 (Dong et al., 2014) activates secondary wall biosynthesis and promotes the significant thickening of fiber cap cells of the pod ventral suture secondary walls. The differences between wild and cultivated soybean is within promoter region and subsequently expression level (Dong et al., 2014).
Timing of seed germination is one of the key steps in plant life. Seed dormancy is considered as a block to the completion of germination of an intact viable seed under favorable conditions (Baskin and Baskin, 2004;Weitbrecht et al., 2011). In the wild, many seeds will only germinate after certain conditions have passed, or after the seed coat is physically disrupted (Bewley, 1997;Baskin et al., 2000;Finch-Savage and Leubner-Metzger, 2006;Bewley et al., 2013). In contrast, crops were selected to germinate as soon as they are wet and planted (Weitbrecht et al., 2011). Moreover, easy seed imbibition has crucial role in cooking ability of most grain legumes. Hence, reducing seed coat thickness led to a concurrent reduction of seed coat impermeability during the domestication (Smýkal et al., 2014). Seed dormancy had played a significant role in evolution and adaptation of plants, as it determines the outset of a new generation (Nonogaki, 2014;Smýkal et al., 2014). A diverse dormancy mechanisms has evolved in keeping with the diversity of climates and habitats (Nikolaeva, 1969;Baskin and Baskin, 2004;Finch-Savage and Leubner-Metzger, 2006). In contrast to hormone mediated seed dormancy extensively studied in Arabidopsis or cereals, we have very little knowledge on physical dormancy, as found in legumes (Baskin and Baskin, 2004;Graeber et al., 2012;Radchuk and Borisjuk, 2014). Although hard-seededness was largely overcome in all domesticated grain legumes except of fodder legumes (Werker et al., 1979;Smartt, 1990;Weeden, 2007), it appears in lentil or soybean depending on the cultivation conditions. Physical seed dormancy is caused by one or more water-impermeable cell layers in seed coat (Baskin et al., 2000;Koizumi et al., 2008;Weitbrecht et al., 2011;Radchuk and Borisjuk, 2014;Smýkal et al., 2014). Numerous transparent testa (tt) and tannin deficient seed (tds) mutants (Appelhagen et al., 2014) indicates the important role of proanthocyanidins and flavonoid pigments in Arabidopsis (Graeber et al., 2012) and Medicago  testa development. In Arabidopsis and Melilotus, seed permeability is altered due to in mutation in extracellular lipid biosynthesis (Beisson et al., 2007). Similarly, in the M. truncatula transcriptomic data set (Verdier et al., 2013a), four of 12 Glycerol-3-phosphate acyltransferases (GPAT) genes were identified as putative orthologs of those reported in soybean (Ranathunge et al., 2010). Furthermore, cells of the outer integument in M. truncatula and pea showed abundant accumulation of polyphenolic compounds; which upon oxidation may impact seed permeability (Marbach and Mayer, 1974;Werker et al., 1979;Moïse et al., 2005). Seed dormancy was identified as monogenic trait in mungbean (Isemura et al., 2012); while six QTLs were detected in yardlong and rice bean (Kongjaimun et al., 2012). In pea, Weeden (2007) has identified two to three loci involved in seed dormancy, via testa thickness and structure of testa surface. Two genes involved in seed coat water permeability were recently identified in soybean. One of them, GmHs1-1, encodes a calcineurinlike metallophosphoesterase transmembrane protein, which is primarily expressed in the Malpighian layer (macrosclereids) of the seed coat and is associated with calcium content (Sun et al., 2015). Independently of this, qHS1, a quantitative trait locus for hardseededness in soybean, was identified as endo-1,4-βglucanase (Jang et al., 2015). This genes seems to be involved in the accumulation of β-1,4-glucan derivatives that reinforce the impermeability of seed coats in soybean. Interestingly, both genes are positioned closely to each other of soybean chromosome 2.
Development of pea and particularly model legume Medicago truncatula seeds have been well-characterized at anatomical (Hedley et al., 1986;Wang and Grusak, 2005) and also transcriptomic and proteomic levels (Gallardo et al., 2007;Verdier et al., 2013a). RNA sequencing (RNA-seq) was used to study changes in gene expression, including M. truncatula (Benedito et al., 2008), Medicago sativa , soybean (Severin et al., 2010;Patil et al., 2015), faba bean (Kaur et al., 2012), Lotus japonicus (Verdier et al., 2013b) and chickpea (Pradhan et al., 2014). In pea, transcriptome studies involved vegetative tissues (Franssen et al., 2011), including pods and seeds (Kaur et al., 2012;Duarte et al., 2014;Liu et al., 2015;Sudheesh et al., 2015), and nodules (Zhukov et al., 2015). Seed coat transcriptome of pea cultivars was analyzed in relation to proanthocyanidin pathway (Ferraro et al., 2014) and seed aging (Chen et al., 2013). Moreover, there is pea RNA-seq gene atlas for 20 cDNA libraries including different developmental stages and nutritive conditions (Alves-Carvalho et al., 2015). Comparative transcriptomics study in relation to domestication trait was conducted recently by Zou et al. (2015) in relation to glume and threshing in wheat. Some of the down-regulated genes in domesticated wheat were related to the biosynthetic pathways that apparently define the mechanical strength of the glumes, such as cell wall, lignin, pectin, and wax biosynthesis. Several of so far identified genes underlying key domestication traits (reviewed in Meyer and Purugganan, 2013) are regulated at transcriptional level with altered spatial and temporal expression, such as seed-shattering (qSH1) locus disrupting the development of the abscission zone between grains and pedicles in rice (Konishi et al., 2006) or teosinte branched (tb1) gene causing single stem growth in maize crop (Doebley et al., 1997).
In the present study, we used comparative transcriptomic, anatomical, and metabolite analysis to detect the differences in gene expression, seed coat structure, and metabolites composition between wild and domesticated pea seed coats in relation to one of the two key domestication traits: seed coat and transcriptomic and anatomical analyses of pod dehiscence.

Plant Material
Four parental genotypes included wild P. elatius JI64 from Turkey and cultivated Afghan landrace P. sativum JI92 both from John Innes Pisum Collection (Norwich, UK); wild P. elatius VIR320 (Bogdanova et al., 2012) (North et al., 1989) were used to establish respective phenotypically contrasting (dormant vs. non-dormant, dehiscent vs. indehiscent) bulks. P. elatius VIR320 differs from other wild peas in relation to the absence of gritty and testa pigmentation, possibly as the result of being either semi-domesticate or hybrid between wild and cultivated pea with unknown origin (Bogdanova et al., 2012).

Seed Water Uptake and Germination Assays
The seeds of four parental and 126 RILs of F 6 generation of JI64 (wild) × JI92 (cultivated) and reciprocal (RILs) were harvested from glasshouse grown plants (February-May 2015). Twenty five seeds per line were incubated in petri dishes (9 cm diameter) over two layers of medium speed qualitative filter papers (Whatman, grade 1) wetted with 3 ml of tap water and incubated in a 25 • C incubator with darkness. Imbibition was scored at 24 h intervals based on changes in seed swelling and germination was determined based on the radicle breaking through seed coat. The percentage, Mean germination time (MGT), Timson index (TI), and Coefficient of Velocity (CV) were calculated over 7 days period. We have used these various mathematical measurements in order to more precisely describe germination process as shown by Ranal and Santana (2006).

Determination of Pod Dehiscence
Pod dehiscence was measured either by direct observation of the pods on the plant or by drying harvested pods at room temperature (Weeden et al., 2002;Weeden, 2007). In case of parental lines JI92 (domesticated, indehiscent pod) and JI64 (wild, dehiscent pod) are dehiscence/indehiscence obvious after pod maturing. On the other hand evaluation of RILs was difficult in some cases, as slight pressure on pods by fingers is necessary for opening. If slight pressure was enough to complete fruit opening the line was evaluated as dehiscent, if not this line was evaluated as indehiscent.

Liquid Chromatography-Electrospray Ionization/Mass Spectrometry (LC/ESI-MS) Analysis
Testa was separated from the rest of the seed, crushed, and extracted using mixture of acetone:water (70:30, v/v) with addition of 0.1% ascorbic acid to achieve efficient extraction of polyphenolic compounds in wide range of polarity and structural diversity (adapted from Amarowicz et al., 2009). 0.5 ml of extract was dried under a stream of nitrogen and solid residue was dissolved in 0.5 ml of methanol. The samples were then analyzed by ultra-performance liquid chromatograph Acquity UPLC I-Class coupled to high resolution tandem mass spectrometer Synapt G2-S with ion mobility separation capability (Waters, Milford, USA). Chromatographic column Raptor ARC-18 (100 × 2.1 mm, dp = 2.7 µm, Restek) and mobile phases (MP) A: water + 0.1% formic acid, B: acetonitrile + 0.1% formic acid was used for separation of components present in seed coat extracts. Flow rate of mobile phase 0.2 ml/min was applied. Electrospray was used as ion source. Spray voltage 2.5 kV in positive and 1.5 kV in negative ion mode, were used, respectively. Process of the LC/ESI-MS method optimization and detailed setup of mass spectrometer will be provided in Válková et al. (in preparation).

Laser Desorption/Ionization-Mass Spectrometry (LDI-MS) Analysis
Seeds of each genotype/line were mechanically disrupted and the seed coats were separated and pooled (four seeds per genotype). Description of the studied RILs is given in Table S1. Small pieces ∼2 mm were fixed on MALDI plate using a common double sided adhesive tape. The samples were analyzed directly without application of a matrix. The prepared samples were analyzed using high resolution tandem mass spectrometer Synapt G2-S (Waters) equipped with vacuum MALDI ion source. For desorption/ionization a 350 nm 1 kHz Nd:YAG solid state laser was used. Details of LDI-MS setup and analytical parameters of hydroxylated fatty acids can be found in Cechová et al. (in preparation).

Metabolite Data Treatment
The obtained LC/ESI-MS and LDI-MS data were processed by MarkerLynx XS a software extension of MassLynx platform (Waters). The processed data matrix, i.e., after extraction, normalization and alignment of retention times (in case of LC/ESI-MS data), m/z-values and intensities of signals, were transferred to Extended Statistics (XS) module, EZinfo (Umetrics, Malmo, Sweden), and studied by principal component analysis (PCA) and orthogonal projections to latent structures discriminant analysis (OPLS-DA). Both PCA and OPLS-DA were used for reduction of data dimensionality. OPLS-DA is a multivariate statistical method employing latent variable regression developed as an extension of more frequently used partial least squares method (Trygg and Wold, 2002). Coordinates of particular samples and RT_m/z pairs (or m/zvalues in the case of LDI-MS data) in appropriate biplots and S-plots were used for evaluation of dormant and non-dormant genotypes mutual segregation and significance of detected signals of metabolites. The procedure was adopted and modified from Kučera et al. (2017). The most significant markers were further studied by targeted MS/MS experiments to reveal their identity (Cechová et al., in preparation; Válková et al., in preparation).

RNA Isolation
For Massive analysis of cDNA Ends (MACE) each sample of parental genotype was composed by several pooled developmental stages of seed coat (2, 3, 4 weeks and older) as it is unknown at which stage putative candidate genes are expressed. Seven selected RILs forming each of dormant resp. non-dormant bulk were previously (at F 6 generation) and after mature seed harvest (of F 7 generation used for RNA isolation) tested for germination behavior (Table S1). Seed coat were dissected under stereomicroscope, immediately frozen in liquid nitrogen and stored at −70 • C until use. Frozen seed coats or dorsal and ventral sutures of pods were ground to a fine powder with liquid nitrogen using sterile mortars and pestles. Total RNA was isolated from seed coat (∼100 mg) using the BioTeke Plant Total RNA Extraction Kit (China) or NucleoSpin RNA Plant kit (Macherey Nagel) for pods, according to the manufacturer's instructions. Yield/quantity and purity was determined by using NanoDrop 2000 spectrophotometer (Thermo Scientific) and diluted in DEPC-H 2 O to 100 ng/µl. Isolated RNAs were treated with DNase according to Baseline-ZERO TM DNase protocol (Epicenter). In case of parental genotypes (JI64, JI92, Cameor, and VIR320) four consecutive developmental stages of seeds (14-37 DPA) were taken each represented by 1.25 µg of total RNA. The RIL bulks were made of 1.425 µg of total RNA of each of seven lines. In case of RNA samples used for pod dehiscence study, two parental lines (JI64 and JI92) and two bulks of contrasting RILs (with dehiscent or indehiscent pods) were used. The bulk of dehiscent RILs was established from eight and bulk of indehiscent RILs from five lines using excised pod sutures of 10 and 20 days after flowering. Each of these four final samples contained ∼1 µg of total RNA each. The integrity of the RNA samples was examined with an Agilent 2100 Bioanalyzer (Agilent Technologies, Palo Alto, USA).

Massive Analysis of cDNA Ends (MACE)
MACE libraries were generated using GenXPro's MACE kit (GenXPro GmbH, Frankfurt, Germany) as described in Zawada et al. (2014). Briefly, cDNA from 5 µg of total RNA was randomly fragmented and biotinylated 3 ′ ends were captured after binding to a streptavidin matrix. A library ready for high-throughput sequencing was prepared using TrueQuant adapters included in the kit. The library consisted of 50-700 bp-long fragments derived from the 3 ′ -end of the cDNAs. The 5 ′ -ends of the libraries were sequenced on a HiSeq 2000 machine (Illumina) with 100 cycles to generate the MACE tags, each tag representing one single transcript molecule. In total, 6 cDNA libraries were prepared and sequenced for seed dormancy, while 4 libraries for parents and two contrasting bulks for pod dehiscence, each providing over 10 million reads (Table S2).

Bioinformatics
After sequencing the reads are in 5 ′ -3 ′ orientation. To remove PCR-bias, all duplicate reads detected by the TrueQuant technology were removed from the raw datasets. Low quality sequence-bases were removed by the software Cutadapt (https:// github.com/marcelm/cutadapt/) and poly(A)-tails were clipped by an in-house python-script. The reads were aligned to reference sequences using Novoalign (http://www.novocraft.com). This tool maps reads to reference sequences depending on certain parameters (i.e., quality) and calculates thresholds for each assignment. The reference sequences consisted of all Pisum mRNA sequences from NCBI. We annotated these sequences to all Fabaceae proteins from Uniprot "http://www.uniprot.org/" by BLASTX to Swissprot ("sp|.., " good annotation) and afterwards to Trembl ("tr|..., " less good annotation) protein sequences. All reads that could not be mapped to Pisum mRNA sequences from NCBI were used for a de novo assembly to generate contigs denoted as "noHitAssembly_xxx" and annotated in the same way as the Pisum mRNA sequences from NCBI. Normalization and test for differential gene expression between the bulks were calculated using the DEGSeq R/Bioconductor package . Differential gene expression was quantified as the log 2 ratio of the normalized values between two libraries (log 2 FC). The p-value and correction for multiple testing with the Benjamini-Hochberg false discovery rate (FDR) were computed due to determining significance of gene expression differences in pairwise comparisons of libraries. Lists of Differentially Expressed Genes (DEGs) for three comparisons of contrasting phenotype (dormancy: wild × cultivated, RILs and their parents, dehiscence: RILs and their parents) were made based on combination of pairwise comparisons of log 2 FC ratio of the normalized values (log 2 FC > 2, log 2 FC < −2) and FDR (FDR < 0.01) between all libraries of these groups.

GO and KEGG Annotation
Gene Ontology (GO) enrichment analysis and normalized gene expression data were used to identify function and relationships of differentially expressed genes (Young et al., 2010). The results of the GO analysis were then exported into the Blast2go for the final annotations. The annotations provided the fragments with blast hit with the appropriate gene ontology terms which were classified into three categories: biological process (BP), cellular components (CC), and molecular function (MF). The DEGs were subjected for their presence in the different Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. The various enzyme activities and the DEGs involved in the KEGG pathways were revealed for each of the combinations.

Genetic Mapping of Dehiscence Specific SNPs-Methodology
Transcripts containing SNP with at least five reads in both samples that are homozygous distributed in dehiscent vs. indehiscent RILs bulks with only one false allele read in 100 in either bulks were considered dehiscence specific. SNPs were discovered using Joint-SNV-Mix (Roth et al., 2012). The output given by Joint-SNV-Mix was furthermore processed by GenXPro in-house software to filter the SNPs. A minimum coverage of 10 bp was needed to be identified as an SNP. To identify the genomic localization of the SNP the surrounding region of the SNP was assigned per blastn to the genome of M. truncatula "JCVI.Medtr.v4.20130313" from http://jcvi.org/medicago. The "snpviewer" a webtool from the "http://tools.genxpro.net" was used to visualize the data.

Real-Time Quantitative Reverse Transcription PCR
Gene-specific oligonucleotide primers were designed (Table S3) based on MACE consensus sequences using the FastPCR software (Kalendar et al., 2014). The expression of selected candidate genes was validated by quantitative real time PCR (qRT-PCR). RNA samples (treated with DNase) were reverse-transcribed with Oligo(dT) 15 primer (Promega) in a two steps reaction in final volume 40 µl. The qRT-PCR analysis was run on the CFX96 TM Real-Time Detection System (Bio-Rad) using the SensiFast SYBR R No-ROX kit (Bioline) or LightCycler R 480 SYBR Green I Master kit (Roche) in case of pod dehiscence study. Primers were designed using FastPCR or Oligo Primer Analysis Software (Molecular Biology Insights, USA) and produced amplicons ranging from 77 to 220 bp (Table S3). Every PCR reaction included 2 µl cDNA (1:10 diluted cDNA), 5 µl 2× SensiFAST SYBR mix or LightCycler R 480 SYBR and 400 nM of each primer in final volume 10 µl. The expression was studied at two developmental stages in four contrasting parental genotypes (JI64, JI92, Cameor, and VIR320) and in case of dehiscence study also contrasting RILs with dehiscent or indehiscent pods. The conditions for PCR were: 95 • C for 2 min; 45 cycles of 95 • C for 10 s, 55 • C for 30 s, and 72 • C for 20 s; followed by a melting curve of 65-94 • C (recovered every 0.5 • C held for 0.5 s). The specificity of primers was confirmed by melting curve and gel analysis of products. Quantification of transcript level was determined by CXF Manager Software (Bio-Rad). Actin or β-tubulin gene was used as a reference to normalize relative quantification (Ferraro et al., 2014) using the comparative Ct (2 − Ct ) method. Changes in transcript were estimated as fold change relative to the expression in the genotype Cameor (younger stage) in case of seed dormancy study and genotype JI92 (younger stage) in case of pod dehiscence study.

Anatomical and Germination Differences between Seed Coat of Wild and Cultivated Peas
Wild pea seeds display high level of dormancy mediated by seed coat permeability. Two contrasting parental pairs of wild (P. elatius) JI64 and VIR320 and cultivated (P. sativum) cv. Cameor and JI92 peas were selected as they differ in testa pigmentation, thickness, and dormancy levels as well as pod dehiscence trait. Moreover, RILs were generated from cross of JI64 and JI92 to facilitate mapping. While cultivated pea seeds imbibe readily and germinate within 24 h (JI92, cv. Cameor), wild pea seeds remain highly dormant and imbibe and germinate at 8% (JI64) or 30% levels (VIR320) after 7 days (Figure 1). Mean germination time, Timson index, and Coefficient of Velocity are also very different between respective parental genotypes, 3.4 (MGT), 1 (TI), and 0.29 (CV) for JI92 while being 7, 0.008, and 0.14 for JI64, respectively. RILs displayed more variability in each of the respective measures ( Figure S1), with wide range of percentage of germination (at 7 days) from 4 to 100%, 0.61 to 4.44 CV, 1 to 7 MGT, and 0 to 0.95 TI. All these parameters indicate complexity of imbibition and germination processes as none single is sufficient to fully describe any given line. As shown in Table S1, non-dormant bulk had on average 56% germination over 7 days period, 1.58 CV, 0.36 TI, and 68.02 MGT respective, contrast to dormant bulk lines having 22%, 1.38 CV, 0.28 TI, and 104.71 MGT, respectively. Similarly testa thickness was 107 vs. 135 µm, respectively. Testa thickness was analyzed by micrometric, light microscopy or SEM measurements. Especially JI64 and JI92 differ substantially in palisade cells length, which contributes to overall testa thickness (Figure 2). Dormant genotype JI64 has significantly thicker testa, which might contribute to the water impermeability of seed coat of dormant pea genotypes. There are considerable differences in surface pigmentation and texture of individual lines (Figures 3a,d,g,j). While Cameor is not pigmented visible pigmentation is present in other genotypes with different intensity and localization. The texture of surface is variable among genotypes, particularly in details of macrosclereid tips arrangement defining the surface shape being covered by thin cuticle (Figures 3b,c,e,f,h,i,k,l and Figure S2).
The most obvious is gritty surface of JI64 which was absent in all the other genotypes included in this study. The continuity of surface (cuticle integrity) was interrupted locally by minor fissures in all genotypes. Large fissures developed in seed coat of the non-dormant genotypes Cameor and JI92, mostly in the hilar and strophiolar region later during the imbibition (data not presented). The cytological arrangement of seed coat of tested genotypes varies particularly in macrosclereids. The surface is covered with thin cuticle ( Figure S2) which copies the outer extremities of palisade macrosclereids. Based on light microscopy we did not see any apparent difference in cuticle properties among genotypes as revealed by autofluorescence or Sudan staining ( Figure S2). Interestingly, cuticle is not the only lipidic material localized close to the seed coat surface. Non-cuticular lipidic extracellular material was present also in the very tips of the macrosclereids (Figures S2e-h) containing autofluorescent material (Figures S2a-d). The analysis of seed coat surface was complemented with histochemical analysis of selected compounds of cell wall. Metachromatic staining with toluidine blue of non-dormant genotype Cameor exhibited high level of polyanionic cell wall components, while in contrary non-dormant, but well-pigmented JI92 showed lower abundance of polyanionic compounds similarly to dormant genotypes where metachromasy was mostly limited to macrosclereid tips bellow the cuticle (see Figures S2a-d). Staining with Alcian blue further supports such conclusion (data not shown). Interestingly, metachromatic staining was enhanced with 3% acetic acid pretreatment (Figures S3e-h). Clear connection between presence of tannins and toluidine blue stainability is well-documented in JI92, where deeper staining is present out of pigmented spots (proanthocyanidin positive; Figure S3f). The presence of proanthocyanidin within cell walls of macrosclereids was not detected in non-dormant genotype Cameor but was obvious in JI92 as well as in the dormant genotypes (Figure 4) using both HCl-Vanilin and DMACA tests. Condensed tannin presence was never recorded within the light line (Figure 4) of macrosclereids of any genotype. Macrosclereids of dormant type genotypes seems to be enriched with proanthocyanidins in the cell walls of the entire macrosclereids up to the light line. Seed coat of JI92 is highly enriched with proanthocyanidins only in the dark pigmented spots. Interestingly, the abundance of condensed tannins negatively correlates with toluidine blue stainability and sirofluor staining of cell walls indicating tannin linkages to other compounds within the cell wall. Detected tannins are not extractable with ethanol, or 1M HCl. Alkaline hydrolysis of cell wall bound tannins by 1M sodium hydroxide resulted in loss of tannins stainability (Figures 4c,f). There was intense aniline blue staining of macrosclereids in all genotypes, particularly in the light line of macrosclereids (Figures S2i-l) and their outer part composed in majority of the secondary cell walls. The strongest signal was observed in dormant genotypes, especially in the light line. However, the signal for callose specific antibody did not correspond with aniline blue fluorophore and in general was found rather week and discretely localized (Figure S2i-inlay). Phloroglucinol staining indicative of lignin provided no response in nondormant nor in dormant pea genotypes indicating the absence FIGURE 1 | Cumulative germination percentage of wild P. elatius (JI64, VIR320) and cultivated P. sativum (JI92 and cv. Cameor) seeds tested at 25 • C over the period of 163 h.
FIGURE 2 | Length of the seed coat palisade cells of selected pea genotypes (Cameor, JI92, JI64, and VIR320). Box plot of median with 25th and 75th percentile, whiskers are 5th and 95th percentile; n = 75; each genotype is different from the other (ANOVA p < 0.001). of significant amount of lignins in the testa of analyzed pea genotypes.

Chemical Analysis of Seed Coat Composition
Detection of metabolites present in seed coat related to dormancy was based on comparison of LC/ESI-MS and LDI-MS data of dormant and non-dormant pea genotypes using principal component analysis and orthogonal projection to latent structures. Figure 5 reflects the differences in coordinates of particular genotype samples in corresponding Score plot obtained by Principal Component Analysis of LC/ESI-MS data. Although, individual coordinates do not exhibit statistically significant differences among all the genotypes (e.g., t[2] coordinates of Cameor, Terno, and VIR320), location of each genotypes given by combination of both coordinates provided resolution among particular genotypes. The differences in the coordinates (mutual orientations and values) clearly show the separation of dormant (i.e., L100, JI64, and VIR320) from non-dormant genotypes (i.e., Terno, Cameor, and JI 92) using the acetone:water extract. Separation of L100 and JI64 from non-dormant genotypes is much more significant compared to the separation of VIR 320. This can be explained by possible semi-domesticated status of this genotype. Based on the achieved separation of dormant and non-dormant genotypes by unsupervised Principal Component Analysis (PCA), supervised Orthogonal Projection to Latent Structures (OPLS-DA) was used to find signals mostly responsible for the chemical differences in dormant and non-dormant samples. Those signals (m/z-values of markers with increased intensity in dormant genotypes compared to non-dormant ones) were studied in detail by targeted tandem mass spectrometry (study of their fragmentation after collision induced dissociation in collision cell of mass spectrometer). Details of analytical interpretation can be found in Válková et al. (in preparation). Attention was especially focused on the chemical differences between morphologically the most similar pair of genotypes, i.e., JI 64 and JI 92. Combination of information about retention time, exact mass measurement and fragmentation revealed the identity of the most significant dormancy markers found in acetone-water extracts-dimer and trimer of gallocatechin (m/z 611.1387 and 915.1945, deviation of measured from theoretical m/z-value of parent ion, dtm, -0.8 and -3.3 mDa), respectively, quercetin-3-rhamnoside (m/z 449.1045, dtm -3.3 mDa) and myricetin-3-rhamnoside (m/z 465.1112, dtm 7.9 mDa). Analogously, the chemical differences between dormant JI64 and JI92 genotypes were studied by laser desorption-ionization mass spectrometry (LDI-MS). Measurement in negative ion mode in combination with PCA a OPLS-DA revealed marked differences in the profile of particular hydroxylated long chain fatty acids [i.e., m/z 411.3865, hydroxyhexacosanoate (dtm 2.7 mDa); m/z 425.3927, hydroxyheptacosanoate (dtm −6.8 mDa); m/z 427.3875, dihydroxyhexacosanoate (8.8 mDa); m/z 437.4033, hydroxyoctacosanoate (dtm 3.8 mDa); m/z 441.3973, dihydroxyheptacosanoate (2.9 mDa) and m/z 455.4180, dihydroxyoctacosanoate (dtm 8.0 mDa)]. Two orders of magnitude higher normalized signals of dihydroxyheptacosanoate and dihydroxyoctacosanoate were measured in JI 64 compared to JI 92, i.e., (1.21 ± 0.92).10 −2 and (2.50 ± 1.54).10 −2 vs. (1.34 ± 0.02).10 −4 and (2.19 ± 0.58).10 −4 , respectively. Figure 6 shows the normalized signals of both dihydroxylated long chain fatty acids in seed coats of JI64, JI92, and RILs with respect to dormancy. The majority of dormant RILs exhibit higher content of those fatty acids compared to non-dormant ones.

Seed Coat Transcriptome Differences between Wild and Cultivated Pea
In order to understand the mechanism of testa mediated dormancy we selected seed coat derived from two contrasting parental pairs and two bulks of RILs differing in dormancy level to extract RNA and to identify candidate genes using the next-generation sequencing method of Massive Amplification of cDNA Ends (MACE). The isolation of RNA from wild pea seed coat tissue proved to be very difficult. It is likely that high due to content of free metabolites (oligosaccharides and proanthocyanidins). We assessed the expression patterns in domesticated (cv. Cameor and JI92 landrace) vs. wild P. elatius (JI64, VIR320) pea. Each sample has yielded between 8 and 15  (Table S2). Bioinformatics analysis resulted in identification of 144,000 transcripts (e.g., MACE annotated fragments) expressed in immature seed coat tissue. We have used stringent values of false discovery rate (FDR) ≤ 0.01 and foldchange (log 2 FC) ≥2 as a threshold to identify the significant differences in the gene expression. Applying these criteria, a total of 10,132-11,808 transcripts were found differentially expressed between cultivated (JI92, Cameor) and wild (JI64, VIR320) parents (Table S2). Of these 770 were differentially expressed between all wild vs. cultivated genotypes, of these 374 were up-regulated in cultivated genotypes, and 396 down-regulated ( Figure 7A, Table S4), when annotated to pea transcriptome, respectively.
A heat-map of 1,000 genes with the highest variance among normalized expression between cultivated and wild pea samples ( Figure 8A) further illustrates differences between domesticated and wild pea seed coat expressed genes. This comparison shows the differences between respective pairs (cultivated vs. wild). In addition to contrasting parental genotypes, two bulks of seed coats at several pooled developmental stages of seven dormant and seven non-dormant RILs (Table S1) were analyzed. The bulks were included to minimalize identification of genes associated with respective genetic background rather than dormancy trait. This is clearly shown when DEGs are compared between parents and RIL bulks (having largest number of specific DEGs between parents e.g., 6,204 up-/5,604 downregulated transcripts, while only 869 and 1,014 down-regulated transcripts in RIL non-dormant vs. dormant bulks. In case of gene expression profile of dormancy and non-dormancy RILs and their parents, 299 DEGs were found. When comparison included JI64 and JI92 parents and two respective RIL bulks, there were 83 up-and 216 down-regulated genes ( Figure 7B). In order to visualize the expression pattern of RILs and theirs parents, heatmaps were constructed for 11,808 genes from libraries of dormancy and 6,259 genes from libraries of dehiscence ( Figure 8B).

Enrichment Analysis of DEGs Functional Classes between Wild and Domesticated Pea Seed Coats
In order to investigate transcriptome changes in seed coat associated with evolution under domestication, we assessed the expression patterns of the DEGs in domesticated (cv. Cameor and JI92 landrace) vs. wild P. elatius (JI64, VIR320) pea. We identified 770 DEGs (583 respectively, when ambiguous are removed) is seed coat between wild and domesticated peas. Due to the absence of complete pea genome and likely specificity of seed coat tissue, we could annotate 66% of MACE fragments. Moreover, between 36 and 41% produced ambiguous assignment ( Table S2). For DEGs sequences assigned to GO terms, we observed differences within all three compounds: cellular components, molecular function, and biological process. Several GO groups were found differently expressed. In GO enrichment of DEGs between wild and cultivated the most interesting results belongs to Molecular function group ( Table S5). The most DEGs were found in phenylpropanoid (17 genes in KEGG pathway) and flavonoid (11 genes) biosynthetic pathways (Figures S4, S5) , isomerase (EC:5.5.1.6), 3 ′ -monooxygenase (EC:1.14.13.21), 4-monooxygenase (EC:1.14.13.11) genes (Table S5), respectively. Enzyme 4-coumarate CoA ligase (MACE-S131) catalyzes conversion of 4-coumarate and other derivatives to corresponding esters serving to generate precursors for formation lignin, suberin, flavonoids. In general, main differences in gene expression was detected between enzymes that played important role in secondary metabolites biosynthesis. Different levels of expression were observed for the cellulose synthase enzyme (EC:2.4.1.12) and two enzymes of pectin metabolism pectate lyase (EC:4.2.2.2) and pectin methylesterase (EC:3.1.1.11), which may interfere together with enzymes from the phenylpropanoid and lignin biosynthesis to the structural composition of cell wall. Enzyme 3-hydroxyacyl-CoA (EC:1.1.1.35) dehydrogenase that belongs to fatty acid elongation pathways is also up-regulated in dormant (wild) genotypes. This enzyme participating in of fatty acid biosynthesis showed changes in expression between wild and cultivated genotypes (Figure 6). Out of 548 DEGs, 307 (56%) were assigned to GO-term groups, including 171 (56%) down-regulated and 136 (44%) up-regulated in the domesticated pea seed coat compared to wild samples. As shown in Table S5, the known DEGs were mainly classified into 40 functional categories and involved in 19 biological processes. The results showed that these DEGs mainly distributed in plasma membranes and nucleus after genes expression, and participated in the biological process  of biosynthetic process (60 genes, 12%), metabolism (183 genes, 36%), regulation of transcription (3 genes, 0.5%), transporting (17 genes, 3%), stress response (16 genes, 3%), cell division and differentiation (180 genes, 35%), localization (30 genes, 6%), establishment of localization (28 genes, 6%), lignin synthesis and so on. Through comparative analysis, the two most abundant sub-classes were biosynthesis processes and metabolic processes. The KEGG pathway analysis showed the presence of many genes PsCam051542, PsCam037704, PsCam043296, PsCam000856, PsCam038256, PsCam049689, PsCam016941, PsCam049689, PsCam050533 involved phenylpropanoid biosynthesis. Similarly, PsCam049689, PsCam038227, PsCam005153, and PsCam050665 were found to be associated with flavonoid biosynthesis (Table S5).

Pea Pod Dehiscence
The structure of pea pericarp follows the common arrangement in Fabaceae. The exocarp consists of thick walled epidermis, the relatively thick mesocarp is arranged in several layers of parenchyma and endocarp composed of lignified sclerenchyma on inside of which is thin-walled epidermis. Both exocarp and mesocarp are rich in pectins, as indicated with metachromatic staining of toluidine blue (Figure 11).

Genetic Mapping of Dehiscence Locus
We assume that dehiscent and indehiscent RILs bulks are contrasting for genomic regions which are responsible for dehiscence trait, because of selection for this trait and repeated selfing of RILs lines. On the other hand remaining genomic regions should be represented by polymorphic reads from RILs libraries due to mixing of RILs lines during bulking. When we compared polymorphism in reads from indehiscent RILs bulk and dehiscent RILs bulk three homozygous SNP rich genomic region associated with dehiscent trait were identified due to identification of homozygous SNP which indicated that this region was under selection during RILs lines development ( Figure S6). Based on sequence homology first of them is located in the second half of M. truncatula chromosome 1. Second and third region are more clearer in contrast with first region. Second region is located at the beginning of chromosome 2 where homozygous SNP are concentrated around 2 megabase and third region is situated around 53 megabase at the end of chromosome 3. Others homozygous SNP are spread across all Medicago chromosomes and do not form distinct cluster.

DISCUSSION
Plant domestication process is interesting phenomenon of accelerated human directed evolution. To dissect genetic changes associated with this process, either wild to cultivated crosses and linkage mapping or newly genome wide association mapping are employed to infer on number of genes governing domestication traits. Although some of the genes underlying domestication traits were shown to be regulated at transcriptional level (Doebley et al., 1997;Konishi et al., 2006) limited studies were conducted to investigate transcriptomic changes between wild progenitors and cultivated crops, analyzing pod or seed tissues, such as wheat glumes (Zou et al., 2015). We used comprehensive transcriptomic, metabolomics, and anatomical analyses to compare domesticated and wild pea seed coats and pods in relation to the loss of seed dormancy and pod dehiscence.

Seed Coat Anatomical Structure and Histochemical Properties
Histological analysis of the seed coat in M. truncatula revealed changes in cell wall thickness in the outer integuments throughout seed development (Verdier et al., 2013a). In Arabidopsis and Melilotus (legume), seed permeability was modulated by mutations affecting extracellular lipid biosynthesis (in Verdier et al., 2013a). Similarly, in M. truncatula, cells of the outer integument showed abundant accumulation of polyphenolic compounds (Figure 4); which upon oxidation may impact seed permeability (Moïse et al., 2005). Current knowledge about physical dormancy mainly comes from studies on morphological structure, phenolic content, and cuticle composition in legume species (reviewed in Smýkal et al., 2014). Morphological observation indicated that seed hardness was associated with the structure of palisade and cuticular layer (Vu et al., 2014) and presence or absence of cracks Koizumi et al., 2008). Other authors have proposed that the compositions of carbohydrates, hydroxylated fatty acids, or phenol compounds in seed coats control the level of permeability Xu, 2000, 2001;Shao et al., 2007;Zhou et al., 2010). Mullin and Xu (2001) found that the seed coat of an impermeable genotype had a high concentration of hemicellulose, essentially composed of xylans, which would reduce the hydrophilicity of the seed coat. We have found considerable structural and functional differences in testa properties between wild and domesticated peas. Contrary to Ma et al. (2004), who found small cuticular cracks in soft but not hard seeds of soybean, the surface was similar among used lines with only small discontinuities over the whole surface (Figure 3). However, we cannot exclude that these small fissures resulted from the SEM sample preparation, similarly to the above mentioned work. In the non-dormant genotypes subjected to imbibition, large fissures appear preferentially in the hilar region and strophiole (not shown). However, those are the most likely consequence of embryo imbibition and its volume increase and thus we do not expect those as primary sites of water entrance. There is a lack of detailed description of the primary pathway of water entry in pea as well as in the whole Fabaceae family, although the topic is thoroughly discussed (e.g., Baskin et al., 2000;Meyer et al., 2007;Ranathunge et al., 2010;Smýkal et al., 2014). It is thus not clear whether the hilar or strophiolar region is the primary entrance of water in the non-dormant genotypes as suggested also by McDonald et al. (1988) or Korban et al. (1981) in soybean and common bean (Agbo et al., 1987) or if the minor fissures present in the cuticle and properties of the outer part of palisade macrosclereids make the difference as suggested by Ma et al. (2004). Interesting and generally neglected feature of macrosclereids is a presence of autofluorescent, phenolics containing lipidic material in the terminal caps of macrosclereids above the light line, which is not directly connected with the cuticle (Figure S2). There is no detailed information on the nature of this material or its possible functional significance. Obviously, detailed structure of outer part of the palisade macrosclereids deserves future attention. Dormant genotypes have thicker macrosclereids palisade layer, which might contribute to the water impermeability of coats of dormant pea genotypes as suggested by Miao et al. (2001). However, thickness alone does not necessarily account for water impermeability (de Souza and Marcos-Filho, 2001). Our results also suggest different thickness of palisade layer among the dormant genotypes with the most dormant JI64 having the thickest palisade layer (Figure 2). Metachromatic staining with toluidine blue revealed that the non-dormant non-pigmented genotype Cameor exhibits high level of polyanionic pectins with exposed free carboxyl groups (Figure S3). On the other hand, the non-dormant, but well-pigmented JI92 showed lower abundance of pectins related anions, similarly to pigmented dormant genotypes. There is clear connection between toluidine blue stainability and seed coat pigmentation-anthocyanidin presence. Taken together results from toluidine and Alciane blue with vaniline and DMACA suggest that tannins in the seed coats of pigmented peas are probably bound to other compounds of the cell walls, changing the staining properties of cell walls. Nature of these linkages is unknown, but covalently bound tannins might be indicated as alkaline hydrolysis releases proanthocyanidins from the cell walls (Krygier et al., 1982). Such expectation might be further supported by presence quercetin-3-rhamnoside fragments from LC/MS and MALDI-MS study. The detected increase in metachromatic staining of testa cell walls after weak acid treatment might be indicative of a crosslinking of pectins with other compounds in dormant as well as JI92 genotype, which is released in acid environment. Interestingly, it was reported that condensed tannins might be released from linkages in acid environment (Porter, 1989). We can speculate whether the intensity of pectin-tannin crosslinking is associated with physical dormancy of pea. There are scarce references indicating possible role of tannins in cell wall polymer network and its properties (Pizzi and Cameron, 1986). There is some supposition for callose deposition in the light line area (e.g., de Souza et al., 2012). However, strong staining with aniline blue fluorochrome staining in the upper part of macrosclereids including the light line ( Figure S2) cannot be attributed to callose. Such assumption is consistent with the specific callose antibody localization which led us to the conclusion that the signal for aniline blue is probably signal for one or more other compounds structurally similar to callose. The interaction of aniline blue fluorochrome with other 1,3-or 1,4-β-D-glucans was described by Evans et al. and this interaction depends on the degree of polymerization, and nature of substitution of the 1,3-β-D-glucan chain as well as on the concentration of phosphate in the staining solution (Evans et al., 1984). Deeper anatomical and histochemical analysis of seed coat and more reliable detection of primary entrance point of water during early rehydration phase is needed.

Differentially Expressed Genes during Seed Coat Development and Seed Dormancy
Seed development has been thoroughly studied in number of crops including legumes, especially with focus on embryo development (Bewley et al., 2013). In our study, we have made comparative transcriptomics analysis in order to dissect candidate genes/pathways associated with domestication imposed changes on seed coat properties. Temporal transcriptional changes during seed and pod development were studied in pigeonpea (Pazhamala et al., 2016), soybean (Aghamirzaie et al., 2015;Redekar et al., 2015), Medicago (Gallardo et al., 2007;Benedito et al., 2008;Verdier et al., 2013a;Righetti et al., 2015), peanut (Zhu et al., 2014;Wan et al., 2016), and pea (Liu et al., 2015) seeds but no study was made on comparison of wild progenitor and cultivated crop. Moreover, these studies analyzed either entire seed/pod or developing embryos, while in our work we have used excised seed coat or dissected pod suture. During the RNA isolation from the seed coat tissue, we have experienced great difficulties when working with wild pea seed samples. As reported earlier for pigmented soybean seeds (Wang and Vodkin, 1994) proanthocyanidins binds to RNA and prevent its extraction. We failed when using standard phenol/chloroform (McCarty, 1986) or guanidium thiocyanate (Chomczynski and Sacchi, 1987) methods, as well as common plant tissue RNA isolation kits.
There is problem of ambiguously mapped reads, which are the major source of error in RNA-Seq quantification (Robert and Watson, 2015). Short-read alignment is a complex problem due to the common occurrence of gene families. In contrast to RNA-seq, MACE methodology is derived from 3 ′ UTR end of transcript and each is represented by single molecule. The choice of quantification tool also has large effect, as these also differ in the way they handle aligned data and multimapped/ambiguous reads (Robert and Watson, 2015). In order to exclude or minimalize that identify DEGs are solely due to the genetic differences between contrasting parental genetic stocks we have used phenotypically classified bulks derived from RILs (Table S1, Figure S1). The concept of Bulk segregant analysis (BSA) was established as a method to detect markers in a specific genomic region by comparing two pooled DNA samples of individuals from a segregating population (Michelmore et al., 1991). Coupling BSA with the high throughput RNA sequencing has been shown to be an efficient tool for gene mapping and identification of differentially expressed genes (Chayut et al., 2015;Bojahr et al., 2016). One possible bottleneck of our analysis was bulking of several developmental stages into single MACE sample, where temporal and spatial expression might be hidden, resulting in differences between MACE and qRT-PCR data. Dynamic nature of gene expression both in spatial and temporal levels is clearly seen at qRT-PCR analysis of selected DEGs (Figures 9, 10). The bulking of various developmental stages, moreover from different genotypes (in case of RILs) is the source of imprecision which can result in masking of DEGs. The key to the successful use of BSA is precision of phenotypic assignment. Although some imprecision in phenotypic classification and comparable low number of RILs used for bulking (Table S1), transcriptomics analysis has provided valid results. As shown on heat-map (Figure 8), RIL bulks are indeed genetic mixture of parental genotypes. MACE method (Kahl et al., 2012) detects allele-specific SNPs and indels associated with the defined genotypes that can be instantly used in genetic mapping (Bojahr et al., 2016). As result, there was significant clustering of homozygous SNPs associated with seed dormancy (e.g., respective parental alleles) on Mt chromosomes 3 and 4, or chickpea chromosomes 5 and 7, respectively (not shown). These correspond to pea linkage groups (LG) III and IV. Using identical RIL mapping population and DARTseq markers, we mapped seed coat thickness to LG I, III, IV, and VI (unpublished) and percentage of seed germination to LGII. These indicate that there is likely more than single major gene involved in seed dormancy, acting at different stages (testa thickness, permeability).
Despite that detected DEGs between dormant and nondormant pea seeds belongs to various GO and KEGG pathways, the largest number of annotated ones was found within phenylpropanoid and flavonoid pathways (Figure S4). These are involved in various activities such as UV filtration, fixing atmospheric nitrogen, and protection of cell walls (Zhao et al., 2013). Analysis of soybean mutant defective in seed coat led to identification of differentially expressed proline-rich and other cell wall protein transcripts (Kour et al., 2014). Moreover, this single gene mutation has resulted in differential expression of 1,300 genes, pointing out that complex series of events, many manifested at the transcript level, lead to changes in physiology, and ultimately structure of the cell wall. Similarly, we speculate that gene/-s causative of pea seed dormancy results in complex transcriptional and metabolomics changes. Recently, KNOTTED-like homeobox (KNOXII) gene, KNOX4, was found responsible for the loss of physical dormancy in the mutant Medicago seeds (Chai et al., 2016) resulting in differences in lipid monomer composition. These findings are in agreement with our data obtained by laser desorptionionization mass spectrometric comparative analysis between dormant and non-dormant pea genotypes. Especially long chain hydroxylated fatty acids such as mono and dihydroxylated hexacosanoate, heptacosanoate, and octacosanoate were found in higher concentration in dormant peas compared to non-dormant ones (as shown in Figure 8 for dihydroxyheptacosanoate and dihydroxyoctacosanoate), implying that the presence of a greater proportion of hydroxylated fatty acids may provide a greater interconnectivity of cutin hydrophobic components improving its stability and impermeability for water as discussed also by Shao et al. (2007). As downstream targets of KNOX4 gene, several key genes related to cuticle biosynthesis were identified, such as the cytochrome P450-dependent fatty acid omega-hydroxylase and fatty acid elongase 3-ketoacyl-CoA synthase (Chai et al., 2016). We have not found homologs genes when searched within our DEGs set. It can be hypothesized that different genes have been altered in independently domesticated crops, although possibly acting on identical pathways. There are limited studies combining of transcriptomic and metabolomics analysis (Enfissi et al., 2010) or a combination of both techniques including proteomics (Barros et al., 2010;Collakova et al., 2013). Such integrative approach enables not just to identify transcript and metabolite changes associated with given process, but also to focus on biochemical pathways relevant to studied trait and possibly also delimit candidate genes. As shown in Medicago (Verdier et al., 2013a) and soybean (Ranathunge et al., 2010) the gene expression in seed coat is complex and dynamic. Since we could not currently annotate 10% of detected transcripts, it can be expected that with available pea genome this could be further improved. They might include putative target candidate/-s for seed coat permeability.

Dormant Pea Seed Coat Accumulates More Proanthocyanidins
In legume seeds, there are three parts: the seed coat, the cotyledon, and the embryonic axis which, on average, represent 10, 89, and 1%, respectively, of the seed content. Seed coat pigmentation was shown to correlate with imbibition ability in several legumes, including common bean (Caldas and Blair, 2009), chickpea (Legesse and Powell, 1996), yardlong bean (Kongjaimun et al., 2012), faba bean (Ramsay, 1997), and pea (Marbach and Mayer, 1974;Werker et al., 1979). The presence of proanthocynidins (PAs) in seed coats can be assessed by the appearance of brownish coloration, which is the result of PA oxidation by polyphenol oxidase (Marles et al., 2008). In soybean (Glycine max), the recessive i allele results in high anthocyanin accumulation in the seed coat, resulting in dark brown or even black color (Tuteja et al., 2004;Yang et al., 2010). In contrast, the dominant I allele, which silences chalcone synthase (CHS) expression and hence blocks both anthocyanin and PA biosynthesis, results in a completely colorless seed coat. However, there is not simple relationship between the testa pigmentation imposing dormancy, as numerous cultivated pea varieties have colored testa yet do not display seed dormancy as illustrated by this study used JI92 landrace. Mendel's A gene beside flower color has pleiotropic effect including seed coat pigmentation (Hellens et al., 2010), yet these traits can be decoupled by recombination (Smýkal, unpublished). Second, B gene of pea encodes a defective flavonoid 3 ′ , 5 ′ -hydroxylase, and confers pink flower color, by control of hydroxylation of flavonoid precursors (Moreau et al., 2012). Neither this mutation results in alteration of seed dormancy. Comparably more is known on Arabidopsis and Medicago seed development owing to available mutants. Many of these mutations indicate the important role of proanthocyanidins and flavonoid pigments in testa development (Graeber et al., 2012) including effect on seed dormancy. The proanthocyanidins (PAs) received particular attention due to their abundance in seed coats (Dixon et al., 2005;Zhao et al., 2010) including pea (Ferraro et al., 2014). PAs are also known as the chemical basis for tannins, which are considered to be important part of physical dormancy in some species (Kantar et al., 1996;Ramsay, 1997). Flavan-3-ol-derived PA oligomers and anthocyanins are derived from the same precursors, proanthocyanidins (Lepiniec et al., 2006) and chemical diversity is introduced early in the pathway by cytochrome P450 enzymes (reviewed in Li et al., 2016a). Anthocyanidin synthase, anthocyanidin reductase, and leucoanthocyanidin reductase were studied at transcriptional level by Ferraro et al. (2014) in cultivated pea varieties and showed to be developmentally regulated. In our comparative transcriptome profiling we have not found any of these genes to be among DEGs, suggesting that differences in metabolites (quercetin, gallocatechin) found by chemical analysis are not at these steps of PA biosynthesis. Anthocyanidins are either immediately modified by glycosylation to give anthocyanins by anthocyanidin 3-O-glycosyltransferases (UGTs) or reduced to generate flavan-3-ols (such as epicatechin) by anthocyanidin reductase for PA biosynthesis (Xie et al., 2003(Xie et al., , 2004). There are several described Medicago mutants defective in respective genes, resulting in reduced testa pigmentation (Li et al., 2016b), although the relationship to seed dormancy was not specifically investigated. Indeed, we have detected several differentially expressed UDP-glycosylases, two of them studied by qRT-PCR (Figure 9). The glycosyltransferase superfamily consists of 98 subfamilies and only few have been characterized so far. Only a few members of the UGT72 family been shown to have activity toward flavonoids; such as the seed coat-specific UGT72L1 from Medicago (Zhao et al., 2010) and several seed specific UGTs in L. japonicus (Yin et al., 2017). The UGT72L1 catalyzes (Zhao et al., 2010) formation of epicatechin 3 ′ -O-glucoside (E3 ′ OG), the preferred substrate for MATE transporters. MATE1 mutant display altered seed coat structure and PAs accumulation (Zhao and Dixon, 2011) and also has significantly lower seed dormancy levels (Smýkal, unpublished). The mechanism of PA polymerization is still unclear, but may involve the laccase-like polyphenol oxidase (Zhao et al., 2010). Notably, Medicago myb5 and myb14 mutants exhibit darker seed coat color than wildtype plants, with myb5 also showing deficiency in mucilage biosynthesis, and accumulating only of the PA content of wild-type plants. When myb5 seeds are exposed to water, they germinate readily without dormancy typical of wild type Medicago seeds (R. Dixon, personal communication and P. Smýkal, unpublished). All these observations suggest that PA oligomers play indeed a role in seed coat mediated dormancy.
Our LC/ESI-MS experiments confirm the presence of significantly higher contents of dimer and trimer of gallocatechin (i.e., soluble tannins of prodelphinidin type) in dormant compared to non-dormant pea genotypes. Catechin dimer and trimer was also found in the pea seed coat extracts but their differences between dormant and non-dormant peas are much less significant than their gallocatechin counterparts. This fact points out the significance of hydroxylation of B-ring of PAs in relation to dormancy. The insoluble PAs are the result of oxidative cross-linking with other cell components. Variation in PA content in the pea seeds has been reported (Troszyńska and Ciska, 2002) but not in comparison of wild vs. cultivated peas. PAs play also important roles in defense to pathogens, and because of the health benefits are of industry and medicine interest. PA biosynthesis and its regulation have been dissected in Arabidopsis using transparent testa (tt) mutants, which regulate production, transport or storage of PAs (Lepiniec et al., 2006), and 20 genes affecting flavonoid metabolism were characterized at the molecular level (reviewed in Bradford and Nonogaki, 2009). Many of these flavonoid biosynthesis pathway genes have been found to affect dormancy of Arabidopsis seeds, indicating the role of pigments in this process (Debeaujon et al., 2000). Similarly Medicago also synthesizes PAs in the seed coat, which consists essentially of epicatechin units (Lepiniec et al., 2006;Zhao et al., 2010). Polymerization of soluble phenolics to insoluble polymers is promoted by peroxidases (Gillikin and Graham, 1991) and catecholoxidases (Marbach and Mayer, 1974;Werker et al., 1979), which are abundant in legume seed coats. Positive correlation in content of phenolics, the requirement of oxidation and the activity of catechol oxidase in relation to seed dormancy (germination) in wild vs. domesticated pea seeds have been shown by Marbach and Mayer (1974) and Werker et al. (1979). Recently, epicatechin, cyanidin 3-Oglucoside, and delphinidin 3-O-glucoside were isolated in wild compared to cultivated soybean seed coats (Zhou et al., 2010) with epicatechin being in significant positive correlation with hardseededness.
Beside proanthocyanidin also flavonols, first of all quercetin derivatives, are frequently found in legumes including pea (Dueñas et al., 2004). We found significantly higher content of quercetin rhamnoside in dormant JI64 genotype compared to non-dormant JI92. Similarly, its hydroxylated analog, i.e., myricetin-3-rhamnoside appeared to be a marker of dormancy. This compound was found in many legumes including chickpea, horse gram (Sreerama et al., 2010) and pea (Dueñas et al., 2004). As described in review of Agati et al. (2012) the antioxidant properties of flavonoids represent a robust biochemical trait of organisms exposed to oxidative stress of different origin during plant-environment interactions (regulation of the action of reaction oxygen species, ROS). The effect of ROS on the plant developmental processes including seed germination was described by Singh et al. (2016) including increase of free radical scavenging during pea seed germination (Lopez-Amoros et al., 2006). Presence of phenolic compounds in seed coat might help to protect against fungal diseases during germination as shown in lentil (Matus and Slinkard, 1993).

Pod Dehiscence
Pod maturation might terminate with pod shattering, which is an important trait for seed dispersal of wild species but generally unwanted trait in crops (Fuller and Allaby, 2009). Central to the ballistic seed dispersal in Pisum is the dehiscent pod (single carpel fused along its edges) where the central pod suture undergoes an explosive rupturing along a dehiscence zone (Ambrose and Ellis, 2008). During pod shattering, the two halves of the pod detach due to a combination of the diminished cell walls adhesion in the dehiscence zone, and the tensions established by the specific mechanical properties of drying cells of endo and exocarp of the pod shell. These two principal aspects are shared among families producing dry dehiscent fruit, such as Fabaceae and Brassicaceae (Grant, 1996;Dong and Wang, 2015). The springlike tension within the pod shell is generated during differential drying-induced shrinkage of endocarp and the outer part of the shell-exocarp (Armon et al., 2011). Properties of both the obliquely arranged rigid and lignified inner sclerenchyma as well as the pectin rich exocarp cell with longitudinal orientation are crucial factors of springing in Fabaceae pods. Regulators of their development affecting geometrical arrangement of the layers and their histological properties (cell wall thickness and composition, lignification, hydration) will be key factors generating required tension. Composition and characteristics of pod cell shell cell walls correlate with shattering of yardlong bean and wild cowpea (Suanum et al., 2016) and thickness of the shell and extend of sclerenchymatous dorsal bundle caps was connected with shattering in soy (Tiwari and Bhatia, 1995). The major QTL controlling pod dehiscence in soybean is qPDH1 (QTL for Pod Dehiscence 1). qPDH1 had been recently cloned and shown to encode a dirigent-like protein expressed in the sclerenchyma of differentiating endocarp and modulating the mechanical properties of the pod shell. Lignin biosynthesis is the most likely process affected by qPHD1 (Suzuki et al., 2010;Funatsuki et al., 2014), which might be connected with modulation of torsion within drying pod walls (Funatsuki et al., 2014). However, the precise biochemical activity of qPHD1 is still unclear. Similarly the sclerenchyma differentiation and lignification of the endocarp and valve margin cells of Arabidopsis are central to silique dehiscence with NAC SECONDARY WALL THICKENING PROMOTING FACTOR 1 (NST1) and SECONDARY WALL ASSOCIATED NAC DOMAIN PROTEIN 1 (SDN1) being identified as master regulators of their differentiation (Zhong et al., 2010). Lignification of endocarp and valve margin cells is lost together with dehiscence in the nst1snd1 double mutant (Mitsuda and Ohme-Takagi, 2008). The other decisive point of pod dehiscence is mechanical stability/instability of dehiscence (suture) zone which might trigger the explosive release of pod shell tension (Grant, 1996). Decreased cell to cell adhesion, might be carried out by action of endo-1,4-glucanases and endopolygalacturonases disintegrating the middle lamella in the separation layer (Christiansen et al., 2002). Degradation of pectin in the middle lamella of abscission zone is common theme in fruit shattering (Dong and Wang, 2015). Contrary, mechanism of pod shattering resistance due to reinforcement of the suture has been described in domesticated soybeans NST1/2 homologous transcription factor SHATTERING1-5 (SHAT1-5) from NAC family has been unveiled, inducing excessive secondary cell wall deposition and lignification in the outer part of the suture (fiber cap cells; Dong et al., 2014). Up-regulated expression of SHAT1-5 in domesticated soy "locks" the dehiscence zone interconnecting the vascular bundle caps of sclerenchymatous fibers in the suture vicinity preventing shattering.
The fundamental elements of fruit shattering regulatory network is being uncovered recently in Arabidopsis and homologous genes were identified also in other species and crops (Zea mays, Triticum aestivum, Oryza sativa, Glycine max, Sorghum bicolor, Sorghum propinquum, or Solanum lycopersicum; for review see Dong and Wang, 2015;Ballester and Ferrandiz, 2016). MADS box genes of Arabidopsis SHATTERPROOF1 (SHP1) and SHATTERPROOF2 (SHP2) participate in the dehiscence zone specification . INDEHISCENT (IND) b-HLH transcription factor act down-stream to SHP1/2 as regulator sclerenchyma differentiation in endocarp and valve margin. The shp1/2 double mutant as well as ind produces indehiscent siliques devoid of proper cell specification and differentiation in the dehiscence zone (Liljegren et al., 2004;Dong and Wang, 2015). SHP1 and SHP2 are required for the proper specification of the different cell types within the valve margin and the DZ and both genes probably represent the top of the hierarchy regulating DZ formation . SHATTERPROOF genes along with INDEHISCENT (IDEH) are the main regulators of establishment of lignified layer, which causes pod dehiscence. Another MADS box gene involved in dehiscence zone formation is FRUITFULL (FUL), which expression appears at the inception of the carpel primordia, and soon after becomes restricted to the cells that will give rise to the valves, a pattern that is complementary to that of the SHP genes (Liljegren et al., 2004;Dong and Wang, 2015). Significant up-regulation of SHATTERING1-5 (SHAT1-5) in the fiber cap cells (FCC) of cultivated soybean was shown to be responsible for the excessive cell wall deposition in the FCC, which in turn prevents the pod from committing dehiscence after maturation (Dong et al., 2014). Homologous genes defining dehiscence zone identity and its differentiation INDEHISCENT, SPATULA, SHATTERPROOF, bZIP, and SHATTERING (Ferrándiz et al., 2000;Girin et al., 2011;Dong et al., 2014) were identified in Pisum genome and their abundance was tested in RNA isolated from pod suture tissue of wild and domesticated pea as well as of contrast RILs incurred from crossing of wild and domesticate parent with dehiscent or indehiscent pods. In case of SHATTERPROOF and SHATTERING homologous genes, we found differences in expression between parental lines of pea but we didn't find the similar results in case of contrast RILs. The other homologous genes (INDEHISCENT, SPATULA, and bZIP) did now exhibit any significant difference in expression between dehiscent and non-dehiscent phenotypes.
In the legumes, the pod shattering trait is controlled by one or two dominant genes or QTL. In pea and lentil, genes controlling pod shattering map to a syntenic region, suggesting that the same genes may have been modified during the domestication of the two cool-season legumes (Weeden et al., 2002;Weeden, 2007). Single locus control of pod dehiscence was found in lentil (Ladizinsky, 1998), while two loci in mungbean (Isemura et al., 2012), yardlong bean (Kongjaimun et al., 2012), one controlling the number of twists along the length of the shattered pod, and second the percentage of shattered pods, similarly to two loci found in pea (Weeden et al., 2002;Weeden, 2007), and common bean (Koinange et al., 1996). Bordat et al. (2011) localized Dpo locus responsible for loss of pea pod dehiscence on LGIII. We obtained the similar result by the genome-wide DArTseq analysis. Based on comparison of our candidate gene position in M. truncatula genome and SNPs map (Tayeh et al., 2015) we localized our candidate genes in pea genome. In total of our 25 candidate genes 7 are localized on LGIII, 6 on LGVI, 4 on LGII, 3 on LGV, 2 on LGVII, 2 on LGIV, and 1 on LGI (not shown). MACE-P015, the main candidate gene possibly responsible for pod dehiscence localized on LGIII, is a homolog of peptidoglycan-binding domain protein (PGDB) of M. truncatula (Medtr2g079050). These proteins may have a general peptidoglycan binding function and this motif is found at the N or C terminus of a variety of enzymes involved in bacterial cell wall degradation. Many of the proteins having this domain are so far uncharacterized. Matrix metalloproteinases (MMP), which catalyze extracellular matrix degradation, have N-terminal domains that resemble PGBD (Seiki, 1999). On the other hand our candidate MACE-P015 has also 80% match with Cicer arietinum proline-rich extensin-like protein EPR1 (XM_004488673). Extensins are plant specific structural cell-wall proteins (Lamport et al., 2011); they can account for up to 20% of the dry weight of the cell wall and can significantly modulate mechanical cell wall properties through linkages to other cell wall component, which can play a role in pod dehiscence.

New Insights into Pea Seed and Pod Development in Relation to Domestication
Study of biochemical and molecular mechanisms underlying plant domestication process is important area of research in plant biology. In the current study we used a comparative anatomy, metabolomics, and transcriptome profiling of pods and seed coats in wild and domesticated pea in order to identify genes associated with loss of seed dormancy as well as pod dehiscence.
We have identified genes showing differential expression in respective parents as well as phenotypically contrasting RILs. Among others, there were number of genes belonging to phenylpropanoid pathway, which was also identified by metabolomics analysis of seed coat. Our results support the role of proanthocyanidins and their derivatives in physical seed coat mediated dormancy. One of the identified differentially expressed gene involved in pod dehiscence showed significant down-expression in dorsal and ventral pod suture of indehiscent genotypes. Moreover, this homolog of peptidoglycan-binding domain or proline-rich extensin-like protein mapped correctly to predicted Dpo1 locus on PsLGIII. This integrated analysis of the seed coat in wild and cultivated pea raised new questions associated with domestication and seed dormancy. Having underlying gene(s) in hands for various independently domesticated legume crops it would help our understanding of genetic and molecular processes involved in seeds dormancy. Moreover, extended knowledge on control seed dispersal and seed dormancy is necessary for diverse applications-biodiversity conservation as well as breeding.

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2017. 00542/full#supplementary-material Table S1 | Germination characteristics of parental and RIL lines. Table S2 | Summary statistics for MACE sequencing data and annotation to the Pisum mRNA sequences from NCBI and to the contigs based on the de novo assembly of sequences. Table S3 | List of primers used for qRT-PCR. Table S4 | Complete set of DEGs detected in seed dormancy and pod dehiscence experiments, between wild and cultivated parental genotypes as well as RILs, with indicated homology to annotated pea transcriptome (Tayeh et al., 2015) and MACE expression values.