ORIGINAL RESEARCH article
Seed Coat Pattern QTL and Development in Cowpea (Vigna unguiculata [L.] Walp.)
- 1Department of Botany and Plant Sciences, University of California, Riverside, CA, United States
- 2Department of Nematology, University of California, Riverside, CA, United States
- 3Department of Computer Sciences and Engineering, University of California, Riverside, CA, United States
The appearance of the seed is an important aspect of consumer preference for cowpea (Vigna unguiculata [L.] Walp.). Seed coat pattern in cowpea has been a subject of study for over a century. This study makes use of newly available resources, including mapping populations, a reference genome and additional genome assemblies, and a high-density single nucleotide polymorphism genotyping platform, to map various seed coat pattern traits to three loci, concurrent with the Color Factor (C), Watson (W), and Holstein (H) factors identified previously. Several gene models encoding proteins involved in regulating the later stages of the flavonoid biosynthesis pathway have been identified as candidate genes, including a basic helix–loop–helix gene (Vigun07g110700) for the C locus, a WD-repeat gene (Vigun09g139900) for the W locus and an E3 ubiquitin ligase gene (Vigun10g163900) for the H locus. A model of seed coat development, consisting of six distinct stages, is described to explain some of the observed pattern phenotypes.
Cowpea (Vigna unguiculata [L.] Walp.) is a diploid (2n = 22) warm season legume which is primarily grown and serves as a major source of protein and calories in sub-Saharan Africa. Further production occurs in the Mediterranean Basin, Southeast Asia, Latin America, and the United States. Just over 7.4 million metric tonnes of dry cowpeas were reported worldwide in 2017 (FAOSTAT, 2019), though these numbers do not include Brazil, Ghana, and some other relatively large producers. Most of the production in sub-Saharan Africa is by smallholder farmers in marginal conditions, often as an intercrop with maize, sorghum, or millet (Ehlers and Hall, 1997). Due to its high adaptability to both heat and drought and its association with nitrogen fixing bacteria, cowpea is a versatile crop (Ehlers and Hall, 1997; Boukar et al., 2018).
The most common form of consumption is as dry grain. The seeds are used whole or ground into flour (Singh, 2014; Tijjani et al., 2015). Seed coat pattern is an important consumer-related trait in cowpea. Consumers make decisions about the quality and presumed taste of a product based on appearance (Kostyla et al., 1978; Jaeger et al., 2018). Cowpea displays a variety of patterns, including varied eye shapes and sizes, Holstein, Watson, and Full Coat pigmentation, among others (Figure 1). Each cowpea production region has preferred varieties, valuing certain color and pattern traits above others for determining quality and use. In West Africa consumers pay a premium for seeds exhibiting certain characteristics specific to the locality, such as lack of color for use as flour or solid brown for use as whole beans (Langyintuo et al., 2004; Mishili et al., 2009; Herniter et al., 2019a). In the United States consumers prefer varieties with tight black eyes, commonly referred to as “black-eyed peas” (Fery, 1985).
Figure 1 Seed coat pattern traits. Images of lines from various populations demonstrating the phenotypes which were scored as part of this study.
Seed coat traits in cowpea have been studied since the early 20th century, when Spillman (1911) and Harland (1919, 1920), reviewed by Fery (1980), explored the inheritance of factors controlling seed coat color and pattern. In a series of F2 populations Spillman (1911) and Harland (1919, 1920) identified genetic factors responsible for color expression, including “Color Factor” (C), “Watson” (W), “Holstein-1” (H-1), and “Holstein-2” (H-2). A three-locus system controlling seed coat pattern was established by Spillman and Sando (1930) and was confirmed by Saunders (1960) and Drabo et al. (1988), though “O” was used in place of “C.”
A genotyping array for 51,128 single nucleotide polymorphisms (SNP) was recently developed for cowpea (Muñoz-Amatriaín et al., 2017) which offers opportunities to improve the precision of genetic mapping. Numerous biparental populations have been used to map major quantitative trait loci (QTL) for various traits, including root-knot nematode resistance (Santos et al., 2018), domestication-related traits (Lo et al., 2018), and black seed coat color (Herniter et al., 2018) and to develop consensus genetic maps of cowpea (Muchero et al., 2009; Lucas et al., 2011; Muñoz-Amatriaín et al., 2017). In addition, new populations have been developed for higher-resolution mapping including an eight-parent Multi-parent Advanced Generation Inter-Cross (MAGIC) population containing 305 lines (Huynh et al., 2018). A reference genome sequence of cowpea (Lonardi et al., 2019; phytozome.net) and genome assemblies of six additional diverse accessions (Muñoz-Amatriaín et al., 2019) have been produced recently. Here, we make use of these resources to map a variety of seed coat pattern traits, determine candidate genes, and develop a model for genetic control of seed coat pattern. Additionally, we posit a developmental pattern for the cowpea seed coat to explain some of the observed variation.
Materials and Methods
Ten populations were used for mapping: an eight-parent MAGIC population containing 305 lines (Huynh et al., 2018), four biparental recombinant inbred line (RIL) populations, and five F2 populations. Descriptions of each pattern discussed below can be found in Seed Coat PhenotypingSection and examples can be seen in Figure 1.
One biparental population consisted of 87 RILs developed at the University of California, Riverside (UCR), derived from a cross between California Blackeye 27 (CB27), which has a black Eye 2 pattern, and IT82E-18, also known as “Big Buff” (BB), which has a brown Full Coat pattern (Muchero et al., 2009). The second biparental RIL population consisted of 80 RILs developed at UCR derived from a cross between CB27 and IT97K-556-6 (556), which has a brown Full Coat pattern (Huynh et al., 2015). The third biparental RIL population consisted of 101 RILs developed at UCR, derived from a cross between California Blackeye 46 (CB46), which has a black Eye 2 pattern, and IT93K-503-1 (503), which has a brown Eye 1 pattern (Pottorff et al., 2014). The fourth biparental RIL population consisted of 76 RILs developed at UCR and at the International Institute for Tropical Agriculture in Nigeria, derived from a cross between 524B, which has a black Eye 2 pattern, and IT84S-2049 (2049), which has a brown Eye 1 pattern (Menéndez et al., 1997). The F2 populations were developed at UCR as part of this work. Two F2 populations, consisting of 176 and 132 individuals, were developed from independent crosses between CB27 and Bambey 21 (B21), which has the No Color phenotype. One F2 population, consisting of 143 individuals, was developed from a cross between B21 and California Blackeye 50 (CB50), which has a black Eye 2 pattern. Two F2 populations, consisting of 175 and 119 individuals, were developed from independent crosses between Tvu-15426, which has a purple Full Coat pattern, and MAGIC014, a line developed as part of the MAGIC population but not included in the final population, which has a black Watson pattern.
To temporally describe seed coat development four accessions were examined: CB27, MAGIC059, Sanzi, and Sasaque. CB27 is described above. MAGIC059 has the Starry Night pattern in black and purple and is one of the lines included in the MAGIC population. Sanzi has a Speckled pattern in black and purple. Sasaque has the Full Coat pattern in red and purple.
SNP Genotyping and Data Curation
DNA was extracted from young leaf tissue using the Qiagen DNeasy Plant Mini Kit (Qiagen, Germany). A total of 51,128 SNPs were assayed in each sample using the Illumina Cowpea iSelect Consortium Array (Illumina Inc., California, USA; Muñoz-Amatriaín et al., 2017). Genotyping was performed at the University of Southern California Molecular Genomics Core facility (Los Angeles, California, USA). The same custom cluster file as in Muñoz-Amatriaín et al. (2017) was used for SNP calling. In the F2 populations the extracted DNA was bulked by phenotype, with DNA from 20 individuals combined in each genotyped sample.
For the MAGIC population, SNP data and a genetic map were available from Huynh et al. (2018). The map included 32,130 SNPs in 1,568 genetic bins (Huynh et al., 2018). For the biparental RIL populations, SNP data and genetic maps for the CB27 by BB and the CB46 by 503 populations were available from Muñoz-Amatriaín et al. (2017), and SNP data and a genetic map were available for the 524B by 2,049 population from Santos et al. (2018). The CB27 by 556 genetic map was created using MSTMap (Wu et al., 2008). The CB27 by BB genetic map included 16,566 polymorphic SNPs in 977 genetic bins (Muñoz-Amatriaín et al., 2017); the CB27 by 556 genetic map contained 16,284 SNPs in 2,604 bins; the CB46 by 503 genetic map contained 16,578 SNPs in 683 bins (Muñoz-Amatriaín et al., 2017); the 524B by 2,049 genetic map contained 14,202 SNPs in 933 bins (Santos et al., 2018). For each F2 population, SNPs were filtered to remove non-polymorphic loci between the respective parents. The number of markers used for each population is as follows: the two CB27 by B21 populations, 8,550 SNPs (Supplementary Table 1); the B21 by CB50 population, 8,628 SNPs (Supplementary Table 2); the two Tvu-15426 by MAGIC014 populations, 20,010 SNPs (Supplementary Table 3).
Seed Coat Phenotyping
Phenotype data for seed coat traits were collected by visual examination of the seeds. The scored phenotypic classes consisted of No Color, Eye 1, Eye 2, Holstein, Watson, and Full Coat (Figure 1). No Color indicates no pigmentation present on the seed coat. Eye 1 consists of a loose eye in the shape of a teardrop with spots of color outside the eye on the wider side. Eye 2 consists of a tight eye in the shape of two wings with no pigment observed outside the edge of the eye. Holstein consists of an eye with a defined edge and additional spots of pigmentation spread over the seed coat up to almost completely covering the coat. Watson consists of an eye with an indefinite edge. Full Coat consists of pigment completely covering the seed coat. Two of the lines used for observing seed coat development had other seed coat patterns than those mapped. MAGIC014 had the Starry Night pattern, which consists of incomplete pigmentation covering the entire seed. Sanzi had the Speckled pattern, which consists of small dots of pigment covering the seed coat. Seeds with a paler brown color are often difficult to distinguish between the Eye 1 and Watson patterns. The MAGIC population was scored for Eye 1, Eye 2, Holstein, Watson, and Full Coat patterns (Supplementary Table 4). The CB27 by BB (Supplementary Table 5) and CB27 by 556 (Supplementary Table 6) biparental RIL populations were scored for Eye 2, Holstein, Watson, and Full Coat patterns. The CB46 by 503 (Supplementary Table 7) and 524B by 2,049 (Supplementary Table 8) biparental RIL populations were scored for Eye 1, Eye 2, Holstein, Watson, and Full Coat patterns. The CB27 by B21 and B21 by CB50 F2 populations were scored for the No Color and Eye 2 patterns. The Tvu-15426 by MAGIC014 F2 populations were scored for the Watson and Full coat patterns.
For mapping purposes, each observed pattern was scored individually and mapped independently with scores assigned as “1” indicating presence of the trait and a “0” indicating absence. For example, a line expressing the Eye 1 pattern would be scored as “1” for the Eye 1 trait and “0” for all other traits. Pattern phenotypes are mutually exclusive. As the Eye 1 pattern appears to be epistatic towards the H and W loci, any lines with the Eye 1 phenotype were scored as missing data for other seed coat phenotypes to avoid biasing the mapping. This was the case in all populations other than the MAGIC population, as the mpMap script could not operate with such an extent of missing data. In the MAGIC population, for traits other than Eye 1 (Eye 2, Holstein, Watson, and Full Coat), individuals with the Eye 1 phenotype were scored as “0” instead of as missing data since marking too many lines as missing data caused r/mpMap to fail.
Expected segregation ratios reported in Table 2 were determined based on the type of population, parental and F1 phenotypes. For example, the F2 populations were expected to segregate in a 3:1 ratio for traits controlled by single genes with complete dominant/recessive relationships, while the biparental RIL populations were expected to segregate in a 1:1 ratio. Expected segregation ratios were tested by chi-square analysis.
For the MAGIC population, based on how the population was constructed (Huynh et al., 2018) it was assumed that each fully homozygous parent had a roughly 1/8 probability to pass its genotype at a particular locus to a given RIL. For example, at the C locus, three parents (IT84S-2049, IT89KD-288, and IT93K-503-1) express the Eye 1 phenotype and are proposed to have a C1C1 genotype, while the other five parents are proposed to have a C2C2 genotype. Based on this, a given line in the population is expected to have a 3/8 probability of having a C1C1 genotype and a 5/8 probability of have a C2C2 genotype. At the W and H loci, one parent (CB27) is proposed to have the H0H0 and W0W0 genotypes, while the other seven parents are proposed to have the W1W1 and H1H1 genotypes. Based on this, a line should have a 1/8 probability of having the W0W0 and a 1/8 probability of having the H0H0genotype. By multiplying the probabilities at each locus, the probability of a given genotype can be determined using the following equation:
where PC is the probability of a given allele at the C locus, PW is the probability of a given allele at the W locus, PH is the probability of a given allele at the H locus, and Pnet is the probability of a given genotype. For example, the probability of a C2C2H1H1W0W0 genotype, which would have a Holstein phenotype would be 35/512 ([5/8]*[7/8]*[1/8]). The above method results in a predicted 192:5:35:35:245 phenotypic ratio for the Eye 1 (C1C1), Eye 2 (C2C2H0H0W0W0), Holstein (C2C2H1H1W0W0), Watson (C2C2H0H0W1W1), and Full Coat (C2C2H1H1W1W1) patterns, respectively.
Trait mapping was achieved with different methods for each type of population. In the MAGIC population, the R package “mpMap” (Huang and George, 2011) was used as described by Huynh et al. (2018). The significance cutoff values were determined through 1,000 permutations, resulting in a threshold of p = 8.10E−05 [−log10(p) = 4.09]. Due to the high number of markers in the genotype data, imputed markers spaced at 1 cM intervals were used.
In the biparental RIL populations, the R packages “qtl” (Broman et al., 2003) and “snow” (Tierney et al., 2015) were used as in Herniter et al. (2018). Briefly, probability values were assigned to each SNP using a Haley-Knott regression, tested for significance with 1,000 permutations, and marker effects were determined using a hidden Markov model.
For the F2 populations, the genotype calls of each bulked DNA pool in the population were filtered to leave only the markers known to be polymorphic between the parents, and these were then sorted based on physical positions in the pseudochromosomes available from Phytozome (Lonardi et al. 2019; phytozome.net). Each population’s genotype was then examined visually in Microsoft Excel for areas where the recessive bulk was homozygous, and the dominant bulk was heterozygous. Duplicated populations were examined in conjunction.
Determining Haplotype Blocks
Once significant regions were established through mapping analysis, the overlapping area shared between the four biparental RIL populations was examined to determine the minimal area where all four biparental populations had overlapping haplotype blocks. SNPs located in the hotspots of pseudochromosomes Vu07, Vu09, and Vu10 were examined visually in Microsoft Excel for regions of identity within phenotypic groups. SNPs located in the hotspots which had been removed during trait mapping due to high levels of missing data were added back as presence/absence variations and segregated similar to nucleotide polymorphisms.
Determining Candidate Genes
Genes were examined within each minimal haplotype block. Gene expression data (Yao et al., 2016), from the cowpea reference genome (IT97K-499-35), which has a black Eye 1 (C1C1) pattern available from the Legume Information System (legumeinfo.org) were examined for expression in developing seed tissue. Genes encoding proteins known to be involved in regulation of the flavonoid biosynthesis pathway were prioritized.
Determining Allelic Series
Dominance relationships were determined by examining the phenotypes of several F1 progeny in addition to segregation ratios in the F2 populations. Crosses were made between CB27 and three lines from the CB27 by BB population (BB-090, BB-113, and BB-074). Seeds from these F1 plants were visually examined for seed coat patterns. CB27/BB-090 seeds had a Watson pattern (C2C2H0H0W1W1), CB27/BB-113 seeds had a Holstein pattern (C2C2H1H1W0W0), and CB27/BB-074 seeds had a Full Coat pattern (C2C2H1H1W1W1). An additional cross was available from the early development of the MAGIC population, where the phenotype of the seed coat on seeds from a maternal C1C2 heterozygote was Full Coat. IT84S-2246 (Full Coat, C2C2H1H1W1W1) was crossed with IT93K-503-1 (Eye 1, C1C1H1H1W1W1) to yield this C2C1H1H1W1W1 maternal parent.
Comparing Sequence Variation
The genome sequences of the candidate genes from each of five genome sequences (the reference genome sequence and four additional genome assemblies) and about 3 kb of upstream sequence were compared using A plasmid Editor (ApE; jorgensen.biology.utah.edu/wayned/ape/). Transcription factor binding sites were predicted in the upstream regulatory region of each gene using the binding site prediction function available from the Plant Transcription Factor Database (Jin et al., 2017; planttfdb.cbi.pku.edu.cn/). The species input was Vigna radiata (mung bean), as a map of cowpea was unavailable. The cowpea reference sequence is of IT97K-499-35. Among the additional sequenced genomes, CB5-2 has the Eye 2 pattern (C2C2), Suvita-2 has the Full Coat pattern (C2C2H1H1W1W1), Sanzi has a Speckled pattern, and UCR779 has the Full Coat pattern (C2C2H1H1W1W1). See Seed Coat Phenotyping Section for pattern descriptions and Figure 1 for examples.
A larger set of SNPs (about 1 million), discovered from whole-genome shotgun sequencing of 37 diverse accessions (Muñoz-Amatriaín et al., 2017; Lonardi et al. 2019), was available from Phytozome (phytozome.net). Among the 37 accessions, 28 had phenotype data available. These lines were examined for variations in the SNP selection panel that were in the gene-coding and regulatory regions of the candidate genes.
The 28 lines from the SNP selection panel with phenotype and genotype data available were tested for correlation in R, using the native “cor.test” function. For input, the phenotype was recorded as “+1” for accessions with the Eye 1 (C1C1) phenotype and “−1” for those without. The genotype was recorded as “+1” for accessions matching the reference genotype, “−1” for the alternate homozygote, and “0” for the heterozygote (Supplementary Table 9).
Seed Color Development
The four accessions for which pattern development was recorded (CB27, MAGIC059, Sanzi, and Sasaque) were grown in a greenhouse at the University of California, Riverside (Riverside, California; 33.97° N 117.32° W) at a constant temperature of about 32°C from March through May 2018. Three plants were used for each accession. Upon flowering, each flower was tagged with the date it opened. The flowers were permitted to self-fertilize. For each day after the flower opened, beginning on the second day, on each of the three test plants a pod was collected until no more green pods were observed.
Seeds from each collected pod were photographed using a Canon EOS Rebel T6i at a 90º angle under consistent lighting conditions. The length of the most advanced seed within the pod was measured using ImageJ (imagej.nih.gov). A developmental scale from 0 to 5 was designed based on the visual observations of the spread of pigmentation (see Results section). Each photograph was scored using this scale.
Phenotypic Data and Segregation Ratios
Phenotypic data and proposed genotypes for each parent in the observed populations can be found in Table 1. A summary of the phenotypic data, along with predicted segregation ratios, chi-square values, and probability can be found in Table 2.
Identification of Loci Controlling Seed Coat Pattern
A total of 35 SNP loci were identified using different methods for each population type (see Materials and Methods section for details) and were concentrated on three chromosomes: Vu07 (C locus), Vu09 (H locus), and Vu10 (W locus). Mapping results can be found in Supplementary Table 10. The overlapping mapping results allowed a narrowing of the area examined for candidate genes.
Determination of Minimal Haplotype Blocks
Following trait mapping, all called SNPs on chromosomes Vu07, Vu09, and Vu10 were examined for minimal haplotype blocks in the overlapping significant regions in the four biparental RIL populations. On Vu07 (C locus) the minimal haplotype block was between 2_12939 and 2_09638 (228,331 bp) and contained ten genes. On Vu09 the minimal haplotype block was between 2_33224 and 2_12692 (166,724 bp) and contained seventeen genes. On Vu10 the minimal haplotype block was between 2_12467 and 2_15325 (120,513 bp) and contained eleven genes. The list of candidate genes can be found in Supplementary Table 11 and on Phytozome (Lonardi et al., 2019; phytozome.net). The minimal haplotype block .0regions can be found in Supplementary Table 12.
I890-dentification of Candidate Genes
A predominant candidate gene was identified at each locus based on high relative expression in the developing seeds (Supplementary Figure 1) and a review of the literature on the regulation of the flavonoid biosynthesis pathway (see Discussion section for details). This led to the determination of a single major candidate gene on each of Vu07, Vu09, and Vu10. Each of the candidate genes belongs to a class which is known to be involved in transcriptional control of the later stages of flavonoid biosynthesis. No Color, Eye 1, and Full Coat mapped to an overlapping area on Vu07, where the gene Vigun07g110700, encoding a basic helix–loop–helix protein, was noted as a strong candidate gene. Eye 2, Holstein, Watson, and Full Coat mapped to a similar area on Vu09, where the gene Vigun09g139900, encoding a WD-repeat gene, was noted as a strong candidate gene. Eye 1, Eye 2, Holstein, Watson, and Full Coat mapped to an overlapping area on Vu10, where the gene Vigun10g163900, encoding an E3 ubiquitin ligase protein with a zinc finger, was noted as a strong candidate gene.
Determination of Allelic Series
Segregation ratios indicated the dominance of H1 over H0 (Holstein locus, Figure 2E, Gii ), W1 over W0 (Watson locus, Figure 2Gi), C2 over C0 (Color Factor locus, Figure 2F), and C2 over C1 (Color Factor locus, Figure 2Giv). The dominance relationship between the C1 and C0 alleles could not be determined from these data.
Figure 2 Interaction of seed coat pattern loci. (A) Table displaying the pattern loci identified in mapping, their locations, the trait encoded, alleles identified, and phenotypes. (B) Table displaying the allelic series and relative dominance of alleles. (C) Segregation patterns for the CB27 by BB and CB27 by 556 F8 populations. (D) Segregation patterns for the CB46 by 503 and 524B by 2049 F8 populations. (E) Segregation pattern for the Tvu-15426 by MAGIC014 F2 populations. (F) Segregation pattern for the CB27 by B21 and B21 by CB50 F2 populations. (G) Phenotype of seeds from the F1 plants resulting from a series of crosses (i) Cross between CB27 and line from the CB27 by BB population with a Watson pattern, resulting in Watson pattern. (ii) Cross between CB27 and a line from the CB27 by BB population a Holstein pattern, resulting in Holstein pattern. (iii) Cross between CB27 and a line from the CB27 by BB population with a Full Coat pattern, resulting in a Full Coat pattern. (iv) Cross between IT84S-2246 and IT93K-503-1 from the early development of the MAGIC population, resulting in a Full Coat pattern in the seed coats on seeds of the F1 maternal parent.
Sequence Comparisons of Candidate Genes
Multiple sequence alignments for each of the three candidate genes and regulatory regions (∼3 kb upstream of the transcription start site) revealed SNPs and small insertions or deletions (Supplementary Datasets 1,2, and 3). None of the variants in the transcript sequence were predicted to cause changes in the amino acid sequence.
The regulatory region of Vigun07g110700 (C locus candidate gene) showed a C/T SNP variation between the reference genome and the four other genome sequences on Vu07 at 20,544,306 bp. The reference genome has a T at this position while the other four sequences have a C. Transcription factor binding site prediction from the Plant Transcription Factor Database (planttfdb.cbi.pku.edu.cn/) indicated that this variation constitutes either a WRKY binding site in the C allele or an ERF binding site in the T allele. Of the 28 accessions in the SNP selection panel, eleven expressed the Eye 1 (C1) pattern and 17 did not. Twenty accessions had a CC genotype, six had a TT genotype, and two had a TC genotype. The correlation test gave an estimated correlation value of 0.75, with a p-value of 3.51E-06, indicating significant correlation between the genotype and phenotype values such that this SNP is a reliable marker for distinguishing between the C1 (Eye 1) and the C2 (Eye 2) alleles. Two of the 28 lines had the No Color (C0) phenotype, but had the CC genotype, indicating that this SNP is not a good marker for the C0 allele (for a possible explanation see Discussion section). The regulatory region of Vigun09g139900 (W locus candidate gene) showed a C/T variation between the reference genome and CB5-2 against the other three genome sequences on Vu09 at 30,207,722 bp. This SNP was not included in the list from the SNP selection panel and so could not be examined like the previous SNP. Transcription factor binding site prediction did not indicate that the site was a target for any transcription factor in either form. The upstream regulatory region of Vigun10g163900 (H locus candidate gene) did not have any distinguishing variation.
Stages of Color Development
A model of seed coat development has been formulated consisting of six stages based on the spread of pigmentation. In Stage 0, there is no color on the seed coat. In Stage 1, color appears at the base of the hilum. In Stage 2, color appears around the hilum. In Stage 3, color begins to spread along the outside edges of the seed. In Stage 4, color begins to fill in on the edges of the testa. In Stage 5, the color has completely developed to the mature level. After Stage 5 the pod and seeds begin to desiccate. Of the observed varieties, only Sasaque and Sanzi completed all six stages. MAGIC059 reached Stage 4, while CB27 only reached Stage 2. No seeds in Stage 0 were observed for Sasaque. Images of each tested variety at various stages can be seen in Figure 3. Color development was associated with seed size; the pigmentation spread as the seeds grew larger.
Figure 3 Seed coat color development. Images showing the development the seed and the spread of pigmentation.
Segregation Ratios and Epistatic Interaction of Seed Coat Pattern Loci
Segregation ratios and dominance data (Table 2 and Figure 2) in the tested populations were consistent with a three gene system with simple dominance and epistatic interactions that matches the C (Color Factor), W (Watson), and one of the H (Holstein) factors identified by Spillman (1911) and Harland (1919, 1920). In brief, the C locus encodes a “constriction” factor while the W and H loci encode distinct “expansion” factors. The C locus is the primary locus controlling seed coat pattern. Pigmentation may be not visible (No Color, C0), constrained to an eye (Eye 1, C1), or distributed throughout the seed coat (Eye 2, Holstein, Watson, or Full Coat, C2). The extent of distribution is modified by the H and W loci, whose contribution is visible only with an unconstrained allele (C2) at the C locus. In the presence of Holstein (H1) and absence of Watson (W0), a Holstein pattern is expressed. Conversely, in the presence of Watson (W1) and absence of Holstein (H0), a Watson pattern is expressed. In combination, the Watson (W1) and Holstein (H1) factors result in the Full Coat phenotype.
Based on the above proposed allelic series, an individual with the C0C0 genotype will express the No Color pattern, regardless of the genotypes at the W and H loci, and an individual with the C1C1 genotype will express the Eye 1 pattern, regardless of the genotypes at the W and H loci. However, when not constricted by a C0 or C1allele (having the C2 allele) the “expansion” factors can be observed. An individual with the C2––W0W0H1–– genotype expresses the Holstein pattern, while and individual with the C2––W1––H0H0 genotype expresses the Watson pattern. An individual with the C2––W1––H1–– genotype, with both “expansion” factors, expresses the Full Coat pattern. An individual with theC2––W0W0H0H0 genotype expresses the Eye 2 pattern. In this latter case the eye pattern is observed despite the unconstricted C2 allele due to the absence of the “expansion” factors. Based on this model, the CB27 by BB and CB27 by 556 populations segregate at the W and H loci (Figure 2C), while the MAGIC, CB46 by 503, and 524B by 2,049 populations segregate at all three loci (Figure 2D). Similarly, the Tvu-15426 by MAGIC014 populations segregate at the W locus (Figure 2E) and the CB27 by B21 and B21 by CB50 populations segregate at the C locus (Figure 2F).
An additional pattern phenotype of Blue-grey Ring was noted in some of the tested populations. Blue-grey Ring consists of a pale ring of bluish-grey surrounding the eye (Figure 1). It appears only with the Eye 1 (C1) phenotype but is not always present when the phenotype is Eye 1 (C1). The Blue-grey Ring phenotype may represent another (fourth) allele at the C locus, or it may result from a combination of the C1 (Eye 1) allele and other pigmentation genes. However, from other unpublished work on seed coat color there does not appear to be a strict correlation between seed coat color and presence of the Blue-grey Ring. Further research is required to clarify the basis of the Blue-grey Ring phenotype.
Pattern Traits Qtl Overlap
Several regions of the genome are hotspots for seed coat pattern traits (Supplementary Table 11). These correspond to locations of genetic factors identified by Spillman (1911) and Harland (1919, 1920), who identified four factors controlling seed coat patterning: Color Factor (C), Watson (W), Holstein-1 (H-1), and Holstein-2 (H-2). The present data suggest the presence of only one Holstein locus or that the two loci are very closely linked in the tested populations. To avoid possible confusion, the Holstein locus discussed here is simply termed “H.”
The major QTL and regions of interest for No Color and Eye 1 are clustered in an overlapping region on Vu07, suggesting that the “constriction” factor at locus C is at that position with allelism at the locus. Mapping results from the Tvu-15426 by MAGIC014 F2 populations indicate that the H locus is on Vu10. Additional evidence for the H locus being located on Vu10 comes from Wu et al. (2019), who identified the Anasazi locus (equivalent to the cowpea H locus) on chromosome 10 of common bean, which is homologous to Vu10 (Lonardi et al., 2019). While none of the biparental F2 populations segregated solely for the W locus, the identification of the C locus on Vu07 and the H locus on Vu10 must, by process of elimination, identify the location of the W “expansion” locus on Vu09.
Seed Coat Pattern Is Due to Failure to Complete the Normal Color Developmental Program
It was noted that the varieties with the Full Coat pattern at maturity followed the developmental pattern described in Stages of Color Development Section and shown in Figure 3 to completion. In contrast, varieties which do not display the Full Coat pattern appear to have color development arrested at certain points. This is most obvious in CB27 (Eye 2, C2), where color development proceeds only to Stage 2. It is likely that other varieties which have distinct eye sizes proceed to varied stages of development. For example, varieties with the No Color (C0) phenotype would not proceed past Stage 0. However, the three gene model presented here does not explain every seed coat pattern. An example is the pattern observed in mature Sanzi seed, which exhibits a Speckled black and purple seed coat (see Seed Coat Phenotyping Section for a description and Figure 1). According to this analysis, Sanzi completes all six stages of seed coat development, indicating that the Speckled pattern is controlled separately. A biparental RIL population, consisting of lines derived from a cross between Sanzi and Vita 7, which has a brown Full Coat pattern (C2C2W1W1H1H1), was used for mapping the black seed coat color; there was a perfect correlation between black seed coat color and the Speckled pattern (Herniter et al., 2018). This indicates that genetic control of the Speckled pattern is colocalized with black seed coat color and may be an allele at the Bl locus, which is located on Vu05.
Further research is needed to determine if all cowpea accessions follow the pattern observed in the four tested lines shown in Figure 3. It may be that each of the observed stages of seed coat pigmentation development is controlled by a different gene, and that failures of normal gene function cause the observed variation in patterning. Evidence for this model is furnished by the noted developmental pattern of the seed coats where development appears to be arrested at Stage 2 in CB27, which expresses the Eye 2 (C2) pattern, and at Stage 4 in MAGIC059, which expresses the Starry Night pattern (see Seed Coat Phenotyping Section for a description and Figure 1). The mechanism by which this occurs is not elucidated here and requires further research. Transcriptome data could be gathered for the seed coat at each developmental stage. The currently available transcriptome data (Yao et al., 2016; legumeinfo.org) used whole seeds at specific days post flowering and do not distinguish between transcripts in the seed coat and those in the embryo or cotyledons, and further do not separate transcripts by developmental stage.
Candidate Gene Function
The later steps in flavonoid biosynthesis are controlled by a transcription factor complex composed of an R2-R3 MYB protein, a basic helix-loop-helix protein (bHLH), and a WD-repeat protein (WD40; Xu et al., 2015). E3 Ubiquitin ligases (E3UL) are believed to negatively regulate this complex (Shin et al., 2015). The color and location (leaf, pod, seed coat) of the pigmentation are determined by expression patterns (Wu et al., 2003; Iorizzo et al., 2018). Candidate genes on Vu07 (C locus) and Vu09 (W locus) encode a bHLH and WD40 protein, respectively. A candidate gene on Vu10 (H locus) encodes an E3UL protein. This information lends itself to a model in which Vigun07g110700 (bHLH) serves as a “master switch” controlling the extent of pigmentation constriction while Vigun09g139900 (WD40) and Vigun10g163900 (E3UL) act as “modulating switches” controlling the type of expanded pattern, altering the effect of the pathway to result in the observed Holstein and Watson patterns (Figure 4). The R2-R3 MYB directs the DNA binding of the complex, with expression of different genes in different tissues resulting in the observed color and location of the pigments. For example, MYB genes identified by Herniter et al. (2018) are required for black seed coat and purple pod tip color. Further, Vigun07g110700 (bHLH) was identified as a candidate gene controlling flower color in cowpea by Lo et al. (2018), indicating a possible dual function of the gene. Indeed, Harland (1919, 1920) noted that a lack of pigment in the flower was often associated with a lack of pigment in the seed coat. Finally, homologs of Vigun07g110700 have been identified in other legumes as Mendel’s A gene controlling flower color in Pisum sativum (Hellens et al., 2010) and as the P gene in Phaseolus vulgaris (McClean et al., 2018).
Figure 4 Proposed roles of the C, W, and H genes. Transcription of flavonoid biosynthesis pathway genes are controlled by a complex composed of three types of proteins (Xu et al., 2015), a basic helix-loop-helix protein (bHLH; e.g., Vigun07g110700, C locus), a WD-repeat protein (WD40; e.g., Vigun09g139900, W locus), and an R2R3 MYB transcription factor. This complex is in turn negatively regulated by an E3 Ubiquitin ligase (E3UL; e.g., Vigun10g163900, H locus). Sequence comparisons suggest that bHLH transcription may be controlled by ERF and WRKY proteins. The observed seed coat pattern phenotypes are a result of different alleles and expression patterns.
Two R2R3 MYB genes (Vigun10g165300 and Vigun10g165400) are located only 110 kb downstream of Vigun10g163900 (H locus candidate gene). However, these fall outside of the haplotype blocks identified in the CB27 by BB and CB27 by 556 populations, indicating that they are not the source of the observed phenotypic variation. However, there may be interaction between one or both of these MYBs and the E3UL responsible for the Holstein pattern; this hypothesis could be investigated through additional research.
The observed C/T SNP variation in the regulatory sequence of Vigun07g110700 (bHLH) at 20,544,306 bp constitutes a difference between a WRKY binding site in the C2 (Eye 2) allele versus an ERF binding site in the C1 (Eye 1) allele. WRKY proteins are positive regulators of seed coat pigment biosynthesis in Arabidopsis (Lloyd et al., 2017) while ERF proteins negatively regulate the same pathway (Matsui et al., 2008). This SNP could be used as a genetic marker to distinguish between the C1 and C2 alleles. The lack of correlation between an observed marker and the C0 (No Color) allele may be caused by other variants, such as a small deletion interrupting gene function, which has been shown in P. vulgaris (McClean et al., 2018). Such a variation would not be detected by the genotyping platform used for this study. Similarly, the observed C/T SNP variation in the regulatory region of Vigun09g139900 at 30,207,722 bp could be used as a marker to distinguish between the W0 (not Watson) and W1 (Watson) alleles, despite not necessarily being the cause of the observed phenotypic variation. No single variation was identified for Vigun10g163900 alleles. However, haplotype blocks determined from the biparental RIL populations can be used for future breeding efforts. Two SNPs which fall within the genome sequence of Vigun10g163900 segregate with the phenotype in the biparental RIL populations. At 2_24359, the lines with the H0 (not Holstein) allele have an A genotype and the lines with the H1 (Holstein) allele have a G genotype. At 2_24360, the lines with the H0 (not Holstein) allele have an A and the lines with the H1 (Holstein) allele have a C. Future research is needed to develop more perfect markers for the three loci.
Data Availability Statement
All datasets [SNPs] for this study are included in the manuscript/Supplementary Files.
IH performed all trait mapping, statistical analysis, and interpretation. RL performed analysis of the seed coat development. MM-A assisted in trait mapping and provided SNP data. SaL assisted in trait mapping. Y-NG extracted DNA for genotyping. B-LH provided the MAGIC population and its genotypic information. ML performed crosses used for allelic series analysis. ZJ assisted in statistical analysis. PR and TC provided guidance and access to population and genetic resources. StL assisted with the SNP selection panel data. TC assisted IH with the writing.
This study was supported by the Feed the Future Innovation Lab for Climate Resilient Cowpea (USAID Cooperative Agreement AID-OAA-A-13-00070), the National Science Foundation BREAD project “Advancing the Cowpea Genome for Food Security” (NSF IOS-1543963), and Hatch Project CA-R-BPS-5306-H.
Conflict of Interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
This manuscript has been released as a Pre-Print in bioRxiv (Herniter et al., 2019b). The authors thank Amy Litt for helpful discussion and guidance on pattern development; Eric Castillo and Sabrina Phengsy for assistance with seed photography; Steve Wanamaker for assistance in the analysis of the various genome sequences.
2049, IT84S-2049; 503, IT93K-503-1; 556, IT97K-556-6; B21, Bambey 21; BB, Big Buff (IT82E-18); bHLH, basic helix-loop-helix; C, Color Factor; CB27, California Blackeye 27; CB46, California Blackeye 46; CB50, California Blackeye 50; E3UL, E3 Ubiquitin ligase; GWA, Genome-Wide Association; H, Holstein; MAGIC, Multiparent Advanced Generation InterCross; QTL, quantitative trait locus; RIL, recombinant inbred line; SNP, single nucleotide polymorphism; UCR, University of California, Riverside; W, Watson; WD40, WD-repeat.
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.01346/full#supplementary-material
Supplementary Figure 1 | Relative expression levels of the candidate genes. TPM, Transcripts per million; dap, days after pollination. Data retrieved from legumeinfo.org.
Drabo, I., Ladieinde, T. A. O., Smithson, J. B., Redden, R. (1988). Inheritance of eye pattern and seed coat colour in cowpea (Vigna unguiculata [L.] Walp.). Plant Breed. 100 (2), 119–123. doi: 10.1111/j.1439-0523.1988.tb00226.x
Fery, R. L. (1985). “Improved cowpea cultivars for the horticultural industry in the USA,” in Cowpea Research, Production and Utilization. Eds. Singh, S. R., Rachie, K. O.(London, UK: John Wiley & Sons, Inc.), 129–135.
Hellens, R. P., Moreau, C., Lin-Wang, K., Schwinn, K. E., Thomson, S. J., Fiers, M. W. E. J., et al. (2010). Identification of mendel’s white flower character. PLoS One 5 (10), e13230. doi: 10.1371/journal.pone.0013230
Herniter, I. A., Muñoz-Amatriaín, M., Lo, S., Guo, Y.-N., Close, T. J. (2018). Identification of candidate genes controlling black seed coat and pod tip color in cowpea (Vigna unguiculata [L.] Walp). G3 8 (10), 3347–3355. doi: 10.1534/g3.118.200521
Herniter, I. A., Lo, R., Muñoz-Amatriaín, M., Lo, S., Guo, Y.-N., Huynh, B.-L., et al. (2019b). Identification of seed coat pattern trait QTL and a description of seed coat development in cowpea (Vigna unguiculata [L.] Walp.). bioRxiv [Preprint]. (Accessed March 26, 2019). doi: 10.1101/514455
Huang, B. E., George, A. W. (2011). R/mpMap: a computational platform for the genetic analysis of multiparent recombinant inbred lines. Bioinformatics 27 (5), 727–729. doi: 10.1093/bioinformatics/btq719
Huynh, B.-L., Ehlers, J. D., Huang, B. E., Muñoz-Amatriaín, M., Lonardi, S., Santos, J. R. P., et al. (2018). A Multi-Parent Advanced Generation Inter-Cross (MAGIC) population for genetic analysis and improvement of cowpea (Vigna unguiculata L. Walp.). Plant J. 93 (6), 1129–1142. doi: 10.1111/tpj.13827
Huynh, B.-L., Ehlers, J. D., Ndeve, A., Wanamaker, S., Lucas, M. R., Close, T. J., et al. (2015). Genetic mapping and legume synteny of aphid resistance in African cowpea (Vigna unguiculata L. Walp.) grown in California.Mol. Breed. 35 (1), 36. doi: 10.1007/s11032-015-0254-0
Iorizzo, M., Cavagnaro, P. F., Bostan, H., Zhao, Y., Zhang, J., Simon, P. W. (2018). A cluster of MYB transcription factors regulates anthocyanin biosynthesis in carrot (Daucus carota L.) root and petiole. Front. Plant Sci. 9, 1927. doi: 10.3389/fpls.2018.01927
Jaeger, S. R., Antúnez, L., Ares, G., Swaney-Stueve, M., Jin, D., Harker, F. R. (2018). Quality perceptions regarding external appearance of apples: insights from experts and consumers in four countries. Postharvest Bio. Tech. 146, 99–107. doi: 10.1016/j.postharvbio.2018.08.014
Jin, J., Tian, F., Yang, D.-C., Meng, Y.-Q., Kong, L., Luo, J., et al. (2017). PlantTFDB 4.0: toward a central hub for transcription factors and regulatory interactions in plants. Nucleic Acids Res. 45, D1, D1040–D104 doi: 10.1093/nar/gkw982
Langyintuo, A. S., Ntoukam, G., Murdock, L., Lowenberg-DeBoer, J., Miller, D. J. (2004). Consumer preferences for cowpea in Cameroon and Ghana. Agric. Econ. 30 (3), 203–213. doi: 10.1111/j.1574-0862.2004.tb00189.x
Lo, S., Muñoz-Amatriaín, M., Boukar, O., Herniter, I., Cisse, N., Guo, Y.-N., Roberts, P.A., Xu, S., Fatokun, C., Close, T.J. (2018). Identification of QTL controlling domestication-related traits in cowpea (Vigna unguiculata L. Walp). Sci. Rep. 8(1): 6261. doi: 10.1038/s41598-018-24349-4
Lloyd, A., Brockman, A., Aguirre, L., Campbell, A., Bean, A., Cantero, A., et al. (2017). Advances in the MYB–bHLH–WD repeat (MBW) pigment regulatory model: addition of a WRKY factor and co-option of an anthocyanin MYB for betalain regulation. Plant Cell Physiol. 58 (9), 1431–1441. doi: 10.1093/pcp/pcx075
Matsui, K., Umemura, Y., Ohme-Takagi, M. (2008). AtMYBL2, a protein with a single MYB domain, acts as a negative regulator of anthocyanin biosynthesis in Arabidopsis. Plant J. 55 (6), 954–967. doi: 10.1111/j.1365-313X.2008.03565.x
McClean, P. E., Bett, K. E., Stonehouse, R., Lee, R., Pflieger, S., Moghaddam, S. M., et al. (2018). White seed color in common bean (Phaseolus vulgaris) results from convergent evolution in the P (pigment) gene. New Phytol. 219 (3), 1112–1123. doi: 10.1111/nph.15259
Menéndez, C. M., Hall, A. E., Gepts, P. (1997). A genetic linkage map of cowpea (Vigna unguiculata) developed from a cross between two inbred, domesticated lines. Theor. Appl. Genet. 95 (8), 1210–1217. doi: 10.1007/s001220050683
Mishili, F. J., Fulton, J., Shehu, M., Kushwaha, S., Marfo, K., Jamal, M., et al. (2009). Consumer preferences for quality characteristics along the cowpea value chain in Nigeria, Ghana, and Mali. Agribusiness 25 (1), 16–35. doi: 10.1002/agr.20184
Muchero, W., Diop, N. N., Bhat, P. R., Fenton, R. D., Wanamaker, S., Pottorff, M., et al. (2009). A consensus genetic map of cowpea [Vigna unguiculata (L) Walp.] and Synteny Based on EST-Derived SNPs. PNAS 106 (43), 18159–18164. doi: 10.1073/pnas.0905886106
Muñoz-Amatriaín, M., Mirebrahim, H., Xu, P., Wanamaker, S. I., Luo, M. C., Alhakami, H., et al. (2017). Genome resources for climate-resilient cowpea, an essential crop for food security. Plant J. 89 (5), 1042–1054. doi: 10.1111/tpj.13404
Muñoz-Amatriaín, M., Lonardi, S., Liang, Q., Shu, S., Wanamaker, S. I., Lo, S., et al. (2019). Genome resources for cowpea (Vigna unguiculata [L.] Walp). Poster session presented at: plant and animal genome conference; 2019 Jan 11-16. San Diego, California.
Pottorff, M., Roberts, P. A., Close, T. J., Lonardi, S., Wanamaker, S., Ehlers, J. D. (2014). Identification of candidate genes and molecular markers for heat-induced brown discoloration of seed coats in cowpea [Vigna unguiculata (L.) Walp]. BMC Genomics 15 (1), 328. doi: 10.1186/1471-2164-15-328
Santos, J. R. P., Ndeve, A. D., Huynh, B.-L., Matthews, W. C., Roberts, P. A. (2018). QTL mapping and transcriptome analysis of cowpea reveals candidate genes for root-knot nematode resistance. PLoS One 13 (1), 1–22. doi: 10.1371/journal.pone.0189185
Shin, D. H., Cho, M., Choi, M. G., Das, P. K., Lee, S.-K., Choi, S.-B., et al. (2015). Identification of genes that may regulate the expression of the transcription factor production of anthocyanin pigment 1 (PAP1)/MYB75 involved in Arabidopsis anthocyanin biosynthesis. Plant Cell Rep. 34 (5), 805–815. doi: 10.1007/s00299-015-1743-7
Tierney, L., Rossini, A. J., Li, N., Sevcikova, H. (2015). snow: simple network of workstations. https://cran.r-project.org/package=snow
Wu, D., Hought, J., Baseggio, M., Hart, J. P., Gore, M. A., Ilut, D. C. (2019). Genomic characterization of the native seeds/search common bean (Phaseolus vulgaris L.) Collection and its seed coat patterns. Gen. Res. Crop Evo. 66 (7), 1469–1482. doi: 10.1007/s10722-019-00823-4
Wu, Y., Bhat, P. R., Close, T. J., Lonardi, S. (2008). Efficient and accurate construction of genetic linkage maps from the minimum spanning tree of a graph. PLoS Genet. 4 (10), e1000212. doi: 10.1371/journal.pgen.1000212
Yao, S., Jiang, C., Huang, Z., Torres-Jerez, I., Chang, J., Zhang, H., et al. (2016). The Vigna unguiculata Gene Expression Atlas (VuGEA) from de novo assembly and quantification of RNA-seq data provides insights into seed maturation mechanisms. Plant J. 88 (2), 318–327. doi: 10.1111/tpj.13279
Keywords: Vigna unguiculata, cowpea, seed coat, pigment, pattern, quantitative trait loci
Citation: Herniter IA, Lo R, Muñoz-Amatriaín M, Lo S, Guo Y-N, Huynh B-L, Lucas M, Jia Z, Roberts PA, Lonardi S and Close TJ (2019) Seed Coat Pattern QTL and Development in Cowpea (Vigna unguiculata [L.] Walp.). Front. Plant Sci. 10:1346. doi: 10.3389/fpls.2019.01346
Received: 25 March 2019; Accepted: 27 September 2019;
Published: 25 October 2019.
Edited by:Matthew Nicholas Nelson, Commonwealth Scientific and Industrial Research Organisation, Australia
Reviewed by:Kirstin E. Bett, University of Saskatchewan, Canada
Leonardo Velasco, Institute for Sustainable Agriculture, Spain
Copyright © 2019 Herniter, Lo, Muñoz-Amatriaín, Lo, Guo, Huynh, Lucas, Jia, Roberts, Lonardi and Close. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Ira A. Herniter, firstname.lastname@example.org
†Current address: Department of Soil and Crop Sciences, Colorado State University, Fort Collins, CO, United States