Original Research ARTICLE
A re-sequencing based assessment of genomic heterogeneity and fast neutron-induced deletions in a common bean cultivar
- 1Plant Science Research Unit, USDA-Agricultural Research Service, St. Paul, MN, USA
- 2Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, MN, USA
- 3Centro de Ciencias Genomicas-Universidad Nacional Autonoma de Mexico, Cuernavaca, Mexico
- 4Hudson Alpha Institute for Biotechnology, Huntsville, AL, USA
- 5Department of Plant Sciences, North Dakota State University, Fargo, ND, USA
- 6Department of Crop and Soil Sciences, University of Georgia, Athens, GA, USA
- 7Corn Insects and Crop Genetics Research Unit, USDA-Agricultural Research Service, Ames, IA, USA
A small fast neutron (FN) mutant population has been established from Phaseolus vulgaris cv. Red Hawk. We leveraged the available P. vulgaris genome sequence and high throughput next generation DNA sequencing to examine the genomic structure of five P. vulgaris cv. Red Hawk FN mutants with striking visual phenotypes. Analysis of these genomes identified three classes of structural variation (SV); between cultivar variation, natural variation within the FN mutant population, and FN induced mutagenesis. Our analyses focused on the latter two classes. We identified 23 large deletions (>40 bp) common to multiple individuals, illustrating residual heterogeneity and regions of SV within the common bean cv. Red Hawk. An additional 18 large deletions were identified in individual mutant plants. These deletions, ranging in size from 40 bp to 43,000 bp, are potentially the result of FN mutagenesis. Six of the 18 deletions lie near or within gene coding regions, identifying potential candidate genes causing the mutant phenotype.
Common bean, Phaseolus vulgaris L., is an important source of proteins and carbohydrates for over three million people worldwide (Broughton et al., 2003). Despite its dietary importance, genetic resources for common bean have lagged behind those of “model legumes” soybean, Medicago truncatula, and Lotus japonicus. However, next generation sequencing (NGS) technologies now make genomic studies applicable to any species of interest.
Mutants are important tools in deciphering gene functions. In common bean, individual mutants can be created for a gene of interest through plant transformation (Aragao et al., 1996; Kwapata et al., 2012). Gene expression patterns of various genes in common bean can also be knocked out or down through the use of virus induced gene silencing (Diaz-Camino et al., 2011; Zhang et al., 2013). These methods both require prior knowledge of genes of interest. In contrast, mutant populations can be screened for phenotypes of interest and genes responsible identified through various approaches. The analysis of traits in various species including those related to plant architecture, yield, and stress response genes have been improved by utilizing mutant screens (Papdi et al., 2010; Bolon et al., 2011; Ma et al., 2013).
Structural variation (SV), including presence absence variation, inner and intra chromosomal translocations, insertions, and deletions is believed to be an important component of phenotypic diversity in both plants and animals (Lai et al., 2010; Stankiewicz and Lupski, 2010; Cao et al., 2011; Eichten et al., 2011; Wang et al., 2011; McHale et al., 2012). Genomic variation both between and within cultivars has been identified in soybean (Bolon et al., 2011; Haun et al., 2011; McHale et al., 2012), maize (Lai et al., 2010), rice (Huang et al., 2012), and Arabidopsis (Cao et al., 2011; Belfield et al., 2012). Two studies in soybean examining SV between cultivars (McHale et al., 2012) and within the Williams 82 cultivar (Haun et al., 2011) determined the genomic regions most enriched for SV were gene rich regions, particularly regions containing resistance genes.
Here, we present a small fast neutron (FN) mutant population for common bean and demonstrate how NGS technologies, such as DNA-seq provide for fast, high quality analysis of genomic variation to identify potential candidate genes for observed phenotypes. Additionally, the DNA-seq data allowed us to examine the natural variation existing within the Red Hawk cultivar. Such natural variation is a rich source of genomic diversity that can be utilized in future cultivar development.
Materials and Methods
Development of Phaseolus vulgaris Fast Neutron Population
Ten thousand P. vulgaris cv. Red Hawk seeds (Kelly et al., 1998), an Andean cultivar adapted for growth in the upper Midwest, were sent to the McClellan Nuclear Radiation Center at the University of California-Davis for irradiation. Five thousand seeds were treated with either 16 or 32 Gys of FN radiation. Treated seeds were sent to the Illinois Crop Improvement Association (ICIA) facility in Puerto Rico in November 2009 along with 200 wild type seeds from the same seed lot. Approximately 70% of the 5000 seeds treated with 16 Gys germinated, while none of the seeds treated with 32 Gys germinated. Seedlings were allowed to mature at the ICIA facility, where plants were phenotyped and seeds were collected from all mature plants. Seeds from ~88 plants with striking mutant phenotypes such as developmental delays, plant stature, pod set, and pod size variations, were harvested individually. Remaining mutant plants were bulk harvested. Wild type plants grown at ICIA were also bulk harvested.
Seeds from individually collected plants were planted at the University of Minnesota Experiment Station in St. Paul in 2010. Approximately 10,000 seeds from the bulk collection of mutants were also planted. Phenotyping was performed throughout the growth season, complemented by photographs. Selected individuals with visible and/or maturity phenotype variations were harvested. In 2011, seeds from selected 2010 M2 individuals were planted in 10 ft rows (~20 seeds). Phenotypes observed throughout the 2011 growth season were compared to documented phenotypes from previous years to determine if trait expression was consistent. Additionally, segregation among the 20 plants per mutant line was noted. Three to four individuals in each row with visible/stable traits were tagged, photographed and seed was harvested.
DNA-Seq Analysis of Fast Neutron Mutants
Five FN mutant plants with different, stable, obvious phenotypes (Figure 1) were chosen for paired end sequence analysis. The following FN mutant plants chosen: 1R5C01r5CPVMN11, a plant with decorative chlorotic leaves early in the growing season (Figure 1A), 1R19C15r28CPVMN11, a small plant with lanceolate leaves (Figure 1B); 1R22C04r31CPVMN11, an upright plant with rugose leaves (Figure 1C); 2R29C12r78CPVMN11; which phenotypically resembled the wild type plant but was delayed in maturity (Figure 1D); and 3R5C25r87CPVMN11, which exhibited interveinal chlorosis (Figure 1E). The mutant plants will respectively be referred to as lanceolate, rugose, decorative, maturity, and chlorotic throughout the rest of the manuscript. M3 seeds of the plants chosen for sequencing were collected and planted at the University of Minnesota Experiment Station in St. Paul in 2012 to ensure the phenotype was maintained through the M3 generation. Leaf tissue from a representative wild type plant and from each of the chosen mutant plants at the M2 generation was collected from 2011 field-grown plants early in the morning and immediately placed at −80°C to inhibit DNA degradation. DNA from all six plants (WT and five mutants) was extracted using the phenol:chloroform method as described (Liu et al., 1997). Each DNA sample was visually inspected on a 1% agarose gel, to ensure that the samples were not degraded. DNA concentration and purity was assessed using an Agilent 2100® Bioanalyzer™ (Agilent®, Santa Clara, CA). DNA samples were submitted to the molecular biology core at the Mayo Clinic, Rochester, MN for paired end sequencing on an Illumina HiSeq 2000. To reduce variability, DNA from all samples were multiplexed and run in a single lane. Low quality reads and adaptor sequences were removed, resulting in 31 million paired end reads per sample.
Figure 1. Visual phenotype of five Phaseolus vulgaris cv. Red Hawk mutants from the fast neutron mutant population chosen for DNA-seq. All plants are from the M2 generation and were grown at the University of Minnesota Experiment Station in St. Paul, MN in 2011. (A) Mutant 1R5C01r5CPVMN11 referred to as decorative due to the chlorotic patterning on leaves early in the growing season. (B) Mutant 1R19C15r28CPVMN11 referred to as lanceolate due to the elongated shape of the leaf. This mutant also appeared shorter than the WT plants in the field. (C) Mutant 1R22C04r31CPVMN11 referred to as rugose due to the crinkled leaf texture. The petioles of this plant also appeared shorter and more upright than the WT. (D) Mutant 2R29C12r78CPVMN11 referred to as the maturity mutant. This plant is phenotypically identical to the WT except for a delay in maturity. (E) Mutant 3R5C25r87CPVMN11 referred to as the chlorotic mutant due to the interveinal chlorosis patterning observed in the leaves. (F) Wild-type Phaseolus vulgaris cv. Red Hawk for comparison.
Paired-end genome sequences were mapped to the P. vulgaris G 19833 genome sequence available at www.phytozome.net using BWA (Li and Durbin, 2009) with default parameters. The resulting mapping files were further sorted, indexed, and translated to binary format (BAM files) using samtools (Li et al., 2009). The sequence alignments were visualized using IGV (Robinson et al., 2011). This approach aligned 70% of all Red Hawk DNA sequences to 88% of each of the 11 P. vulgaris chromosomes with 12X sequence depth. Regions of genomic deletions were identified using custom perl scripts. To confirm these deletions, the program, CREST (Wang et al., 2011) was also used to screen the FN mutant plants. This program identifies genomic deletions, insertions, inversions, and translocations by identifying soft clipped reads and the read coverage at potential breakpoints to calculate if the probability of observing the number of soft clipped reads at a given location is >0.05 based on a binomial distribution. Statistically significant (P < 0.05) SV are retained for further consideration. CREST analysis was performed for each mutant compared to the wild type control. Genomic deletions resulting from differences between cultivars were masked using the –g function. We chose to focus our analysis on genomic deletions as these are called with the greatest confidence and can be confirmed by visual screening of genomic alignments. Deletions >40 bp identified by both perl scripts and CREST analysis were further characterized by genic location: intergenic, promoter, exon, intron, and 3′UTR. Genomic regions of natural variance within the cultivar were identified by identifying deletions common to multiple, but not all, FN mutant plants (Figures 2A, 3). Single nucleotide polymorphisms (SNPs), small (<40 bp) insertions and deletions (INDELs) (either unique to a single plant or common to multiple plants) were identified using the pileup function in SAMtools (Li et al., 2009). Unique or common SNPs and INDELs were identified using the compareBed function of BEDtools (Quinlan and Hall, 2010) and custom perl scripts. SNPs and INDELs with a read depth <6 (half the average genome coverage) were removed from further analysis. Custom perl scripts were used to characterize SNPs and INDELs by genic locations as described above.
Figure 2. Example of sequence alignments illustrating cultivar heterogeneity and fast neutron induced deletions. Paired end sequences from six Red Hawk individuals were aligned to the G 19833 genome sequence available at www.phytozome.net using BWA (Li and Durbin, 2009) with default parameters. Using CREST and custom perl scripts, we identified statistically significant deletions illustrating three classes of SV. SV was visualized using IGV (Robinson et al., 2011). These regions of SV were all identified on chromosome 1, from the two regions highlighted by filled black boxes above the alignments. The double vertical line within the alignments represents chromosomal region between the two genomic regions depicted. For each individual, a histogram plot illustrates the read depth with individual reads plotted below. Black boxes highlight the three classes of SV. (A) Class 2, regions of genomic heterogeneity within the Red Hawk cultivar. These deletions were identified in two or more Red Hawk individuals and represent the residual heterogeneity in the fast neutron mutant population. This deletion spans 3408 bp on chromosome 1 and is only evident in the lanceolate and chlorotic individuals. (B) Class 1, sequences present in the reference genome, but missing in all Red Hawk individuals. These regions illustrate the differences between the common bean cultivars. This particular region spans 5000 bp of the reference genome. (C) Class 3, sequences missing in a single individual but present in all other lines. This class is most likely the result of fast neutron mutagenesis and may be responsible for the mutant phenotype. This particular deletion is approximately 1500 bp long in the chlorotic mutant and is immediately downstream of the predicted gene Phvul.001G128600 (shown in blue below alignments), which encodes a RecA protein.
Figure 3. CREST analyses identifies SV belonging to Class 2 and Class 3. The number of large genomic deletions in each class identified per 500,000 bp on each chromosome of P. vulgaris were counted and plotted as a vertical line. The height of the line indicates the number of SV per 500,000 bp region that were identified. The largest number of SV per chromosomal region is noted to the left of each chromosome. The mutant containing the deletion is represented by color: blue, chlorotic; purple, maturity; green, decorative; red, rugose; orange, lanceolate. Deletions belonging to Class 3 are likely a result of FN mutagenesis and are highlighted by an asterisk (*). Deletions potentially impacting gene expression are highlighted with an E. Note the region of natural variation (SV Class 2) shared by three mutant plants on chromosome 2.
FN mutant populations have proven to be a valuable asset for genetic studies in a variety of crop species (Starker et al., 2006; Bolon et al., 2011; Xiao et al., 2011). We have established a small FN mutant population for common bean using P. vulgaris cv. Red Hawk. Seeds from 88 plants with stable, visual phenotypes are available for public use (Table 1) by contacting Dr. Carroll Vance at firstname.lastname@example.org or Jeff Roesler at email@example.com. Phenotypes observed in the field include varying degrees and types of chlorosis and altered maturity (usually delayed). Additionally, bulk seed from M2 and M3 plants is available for researchers wishing to screen mutants for a particular trait of interest.
Use of DNA-Seq to Identify Regions of Structural Variation
The DNA-seq reads from the five FN mutant lines and one wild-type cv. Red Hawk individual were mapped to the common bean reference genome sequence (accession G 19833). This analysis allowed us to identify three classes of SV (Figure 2): (1) sequences present in the reference genome sequence, but absent from all six of the Red Hawk individuals (Figure 2B); (2) sequences present in both the reference genome sequence and at least one Red Hawk individual, but absent in two or more Red Hawk individuals (Figure 2A); (3) sequences absent from one FN line, but present in all other samples (Figure 2C).
The Class 1 group primarily represents intra-specific structural genomic differences between accession G 19833 and cv. Red Hawk. We identified over one-thousand of these regions, in which all six Red Hawk individuals were missing >100 bp that is present in the reference genome. This is an interesting group of SV to catalog and may have important implications for understanding inter-cultivar phenotypic variation. However, these features do not inform our understanding of the mutant phenotypes in the FN lines. Therefore, we chose to focus the data analysis on the Class 2 and Class 3 groups, which exhibited structural polymorphism among the FN individuals.
The Class 2 group consists of DNA segments present in some FN individuals, but missing in at least two others (Figure 2A). It is highly unlikely the same genomic regions would be deleted in multiple plants by FN irradiation. Our analysis identified 24 genomic deletions belonging to Class 2, which illustrates the natural variation within the common bean cultivar Red Hawk. Specifically, ten unique deletions (P < 0.05) are shared by the lanceolate and chlorotic mutants on chromosome 1, nine genomic deletions were identified on chromosomes 2 and 11 in three mutant plants (lanceolate, maturity, and chlorotic), four deletions on chromosome 4 are common to the maturity and chlorotic mutants, and one deletion on chromosome 11 is shared by the rugose and maturity mutants (Figure 3, Table 2). Class 2 deletions range in size from 41 base pairs (bp) to 12,111 bp. Of all of these deletions, two are within gene introns and two are within 1000 bp upstream of the start codon or 1000 bp downstream of the stop codon, regions involved in regulating gene expression patterns (Table 2). All Class 2 deletions on chromosome 2 lie within 2.7 million bps of each other. This is a region exhibiting high heterogeneity (Figure 3). Within this region three individual FN lines share eight Class 2 deletions. The region is unaffected in the remaining two FN lines and the wild type plant.
The remaining 18 deletions identified by CREST (P < 0.05) belong to Class 3. This class is composed of sequences absent from a single FN line, but present in all other samples (Figure 2C, Table 3). Deletions belonging to Class 3 range in size from 41 to over 43,000 bp and are found on chromosomes 1, 4, 5, 7, 8, and 9. Twelve of the eighteen Class 3 deletions are in intergenic regions of the genome, though as genome annotation improves these regions may contain as of yet unpredicted genes. Six deletions belonging to Class 3 have the potential to alter gene expression patterns. Regions immediately upstream or downstream of gene coding regions are likely involved in regulating gene expression. Three deletions are located within 1000 bp of gene start or stop codons of Phvul.001G128600, Phvul.004G029200, Phvul.004G031900, and Phvul.004G032000 in the chlorotic and maturity mutants. The latter two genes are tightly linked in the genome, so a single deletion may affect the expression of multiple genes. Two Class 3 deletions shorten the introns of Phvul.001G050100 in the lanceolate mutant and Phvul.004G031200 in the maturity mutant. Finally, a 43,034 bp deletion on chromosome 8 in the chlorotic mutant removes the entire sequence for Phvul.008G141500. In summary, we were able to identify statistically significant SV belonging to Class 3 within either the coding region or the potential regulatory region of predicted genes for three of the five mutant plants (lanceolate, maturity, and chlorotic). Genes likely impacted by these deletions represent the most likely candidates responsible for the mutant phenotype of the lanceolate, maturity, and chlorotic mutants. For decorative and rugose, our analysis pipeline failed to identify Class 3 deletions within the regulatory or coding region of genes. These phenotypes may be a result of a heterozygous deletion, a small (<40 bp) INDEL, or a SNP.
DNA-seq permits genome analyses on a base pair scale. Using the approach described in the materials and methods we estimated that there are 32,499 SNPs and 20,363 INDELs shared by multiple FN lines, most likely representing the natural variation caused by genetic heterogeneity within the Red Hawk cultivar (Class 2). We also estimated 92,205 SNPS and 20,340 INDELs unique to a single FN line (Class 3). As described earlier, Class 3 SNPs and INDELs are most likely a result of the FN mutagenesis. Both Class 2 and Class 3 SNPs and INDELs were mapped to the available P. vulgaris genome to determine whether the change in genomic architecture corresponded to a genic region (Tables 4, 5). As was observed with the larger deletions, the majority of the INDELs and SNPs mapped to intergenic regions. Confirmation by PCR will be necessary to determine the false discovery rate of the SNP and INDEL identification.
We demonstrate the feasibility of utilizing high throughput DNA sequencing to analyze FN mutant plants. The use of high throughput sequencing allows the scale of our analysis to be reduced to a base-pair level, providing for the identification and analysis of SNPs, indels, and larger genomic deletions using a single experimental platform. Our analysis identified SV in each Red Hawk individual (wild type and FN mutants) in comparison to the P. vulgaris (accession G 19833) reference genome sequence (available at www.phytozome.net). These SV are inferred to be sequences lost (deleted) from the respective Red Hawk individuals or sequences recently gained by G 19833. (Identifying sequences missing in G 19833 that are present in Red Hawk requires a de-novo assembly of the Red Hawk genome, which is beyond the scope of this analysis.) One may assume that the majority of the SV is either natural or FN-induced deletions in Red Hawk; therefore these events will be referred to as “deletions” throughout the discussion.
Our analysis identified three classes of SV (Figure 2). Class 1 represents the putative inter-cultivar SV, large sequence segments that are missing from all sequenced Red Hawk individuals in this study, but are present in the current G 19833 assembly (Figure 2B). Class 2 represents the intra-cultivar SV exhibiting differences among Red Hawk individuals (Figure 2A). Class 3 represents SV specific to a single Red Hawk individual, which were potentially generated by the FN irradiation (Figure 2C). We focused our analysis on the 23 Class 2 and 18 Class 3 SV that were identified in this analysis. However, it is important to recognize that these classifications are tentative, as a deeper sampling of Red Hawk individuals may re-classify some variants. For example, the limited sampling of one wild-type and five mutated Red Hawk individuals suggests that some of our Class 3 SV may be low frequency natural variants that would have been identified in more than one Red Hawk individual (and thereby be a Class 2 SV) if a larger number of genotypes had been sampled. Similarly, some Class 1 SV may be present at low frequency in Red Hawk, suggesting that these would be re-classified as Class 2 SV in a deeper sampling of genotypes. It is probable that increasing the number of sequenced individuals from Red Hawk and/or G 19833 individuals would identify many additional Class 1 and Class 2 SV, and strengthen the confidence of the Class 3 calls.
A particularly interesting heterogeneous mixture of deletions was identified in a 2.7 million bp region on chromosome 2 of Red Hawk (Figure 3), and represents an intriguing cluster of Class 2 SV. Sequence analysis of the 159 genes within this region revealed 21 are involved in disease resistance response (one dirigent like protein, six leucine rich repeat proteins, and 14 NB-ARC-LRR domain containing disease resistance proteins). This is consistent with previous studies which identified an over-abundance of disease resistance genes located within regions of natural variation (McHale et al., 2012). Additional regions of Class 2 SV are apparent on chromosomes 1, 4, and 11. Analysis of the DNA-seq data for the five individual FN lines suggests the level of natural variation present within the Red Hawk cultivar is higher than that induced by FN mutagenesis. However, aside from the region on chromosome 2, most of the Class 2 SV is in intergenic regions, not likely impacting gene expression or function.
The Class 3 deletions are particularly interesting, as they may be associated with the mutant phenotypes found in the FN irradiated individuals. For two of the FN lines, no unique deletions >40 bp were found within or near gene coding regions (although it is still possible that these lines carry heterozygous deletions within genic regions). For the remaining three FN lines, however, we identified candidate deletions within gene-encoding regions that may be associated with the resulting phenotype.
The lanceolate mutant (1R19C15r28CPVMN11) exhibited a shorter than wild type stature with elongated, or lanceolate, shaped leaves (Figure 1B). CREST analysis identified a single Class 3 SV potentially altering the expression of Phvul.001G050100 (Table 3, Figure 3). The deletion is entirely contained within the last intron region of the gene. The gene is a member of the glycosyltransferase family 47 subgroup C. Proteins encoded by members of this family are bound to the golgi and are involved in cell wall biosynthesis (Jensen et al., 2008). Altering the expression of member of this family/subgroup would impact the physical property of the pectin matrix (Jensen et al., 2008). However, there are no reports of altered leaf phenotypes or plant height in Arabidopsis knockout populations.
Three regions of SV belonging to Class 3 potentially affecting gene expression in the maturity mutant (2R29C12r78CPVMN11) were identified by CREST analysis. In the field, this plant showed no phenotypic derivation from the wild type, except a delay in maturity (Figure 1D). These three deletions, all on chromosome four, potentially affect the expression pattern of four candidate genes (Figure 3, Table 3). The 1st deletion lies immediately upstream of Phvul.004G029200, which encodes a 60S ribosomal protein. This is an unlikely candidate gene for the phenotype observed. The second deletion is in the intron of Phvul.004G031300. Sequence analysis of this gene failed to identify any functional annotations associated with this gene. The final deletion is immediately downstream of both Phvul.004G032000 and Phvul.004G031900. The Arabidopsis homologs of these genes (At3g01090 and At5g19790) encode a protein kinase and an AP2 transcription factor respectively. AP2 transcription factors are involved in regulating flowering and fruit ripening (Huijser and Schmid, 2011). In Arabidopsis, the over-expression of AP2 results in delayed flowering and maturity (Wollmann et al., 2010). It's possible, the deletion immediately down-stream of Phvul.004G031900 causes an increase of AP2 protein accumulation resulting in delayed flowering and maturity.
Finally, the mutant plant 3R05C25r87CPVMN11 exhibited interveinal chlorosis under standard field conditions (Figure 1E). Interveinal chlorosis is often an indicator of nutrient deficiencies. However, DNA-seq analysis of this mutant revealed two SV belonging to Class 3 potentially affecting gene expression patterns (Table 3). The first is a 1518 bp deletion immediately downstream of Phvul.001G128600. The homolog of this gene in Arabidopsis thaliana (At3g10140) encodes a RecA protein, involved DNA repair by binding ssDNA (Miller-Messmer et al., 2012). The second Class 3 SV is a large deletion spaning 43,034 bp on chromosome 8, encompassing the entire sequence for Phvul.008G141500 (Table 3). Sequence analysis of this gene determined it is a member of the SNF2 helicase family. The Arabidopsis homolog of this gene, At2g44980, is the only member of the ALC1 SNF2 subfamily (Knizewski et al., 2008). There is no reported data on the phenotype of an ALC1 knockout in Arabidopsis. However, in mammalian systems, ALC1 is essential for repairing DNA damage (Ryan and Hughes, 2011). Down-regulation of ALC1 protein results in hypersensitivity to damaging agents (Ryan and Hughes, 2011). We hypothesize a similar function is conserved in common bean. Based on the interveinal chlorotic leaf patterning when this gene is completely excised it's possible Phvul.008G141500 is involved in repairing damage to DNA in the leaves, possibly caused by UV radiation.
We identified Class 3 SV, likely a result of FN mutagenesis, ranging from 1 bp substitutions to 43,034 bp deletions, though changes <40 bp have not been visually verified. In the mutated plants, the level of FN irradiation used (16 Gys) induced far more small (<50 bp) deletions, including single base pair substitutions and deletions, than large genomic deletions. These results are consistent with FN induced deletions identified in a recent paper in Arabidopsis (Belfield et al., 2012). The lack of large SV regions belonging to Class 3, as seen in the well-characterized soybean population (Bolon et al., 2011), may be due to our analysis pipeline requiring the deletion to be complete (i.e.,: homozygous). It is possible some larger deletions are present in these lines, but are not yet homozygous in the M2 generation and, as such, were not identified by our analysis. In the soybean FN population, 52 and 38% of the mutants with visual phenotypes were identified from seeds treated with 16 and 32 Gys respectively (Bolon et al., 2011). None of the P. vulgaris plants irradiated at 32 Gys germinated in the field, suggesting the common bean genome may be less resilient to interruption than the soybean genome. The soybean genome may accommodate larger genomic deletions by compensating for gene loss through altered expression patterns of duplicated genes.
A FN population with 88 individuals or bulk seed from M2 and M3 generations is available upon request. We've illustrated the utility of DNA-seq to identify three classes of SV in P. vulgaris individuals. These analyses were greatly facilitated by the availability of the P. vulgaris genome sequence. In the Red Hawk cultivar, natural variation is clustered in regions throughout the genome. These regions of natural variation illustrate the existing genetic potential of common bean germplasm. Our analyses also identified genomic deletions resulting from FN mutagenesis and candidate genes potentially responsible for the altered phenotype in three of the plants selected.
Conflict of Interest Statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Jim Kelly at Michigan State University generously contributed the P. vulgaris cv. Red Hawk seeds for this experiment. This work was carried out in part using resources provided by the University of Minnesota Supercomputing Institute.
Aragao, F. J. L., Barros, L. M. G., Brasileiro, A. C. M., Ribeiro, S. G., Smith, F. D., Sanford, J. C., et al. (1996). Inheritance of foreign genes in transgenic bean (Phaseolus vulgaris L.) co-transformed via particle bombardment. Theor. Appl. Genet. 93, 142–150. doi: 10.1007/BF00225739
Belfield, E. J., Gan, X., Mithani, A., Brown, C., Jiang, C., Franklin, K., et al. (2012). Genome-wide analysis of mutations in mutant lineages selected following fast-neutron irradiation mutagenesis of Arabidopsis thaliana. Genome Res. 22, 1306–1315. doi: 10.1101/gr.131474.111
Bolon, Y.-T., Haun, W. J., Xu, W. W., Grant, D., Stacey, M. G., Nelson, R. T., et al. (2011). Henotypic and genomic analyses of a fast neutron mutant population resource in soybean. Plant Physiol. 156, 240–253. doi: 10.1104/pp.110.170811
Cao, J, K., Schneeberger, S., Ossowski, T., Gunther, S., Bender, J., Fitz, D., et al. (2011). Whole-genome sequencing of mulitple Arabidopsis thaliana populations. Nat. Genet. 43, 956–963. doi: 10.1038/ng.911
Diaz-Camino, C., Annamalai, P., Sanchez, F., Kachroo, A., and Ghabrial, S. A. (2011). An effective virus based gene silencing method for functional genomics studies in common bean. Plant Methods 7:16. doi: 10.1186/1746-4811-7-16
Eichten, S. R., Foerster, J. M., de Leon, N., Kai, Y., Yeh, C.-T., Liu, S., et al. (2011). B73-Mo17 near isogenic lines demonstrate dispersed structural variation in maize. Plant Physiol. 156, 1679–1690. doi: 10.1104/pp.111.174748
Haun, W. J., Hyten, D. L., Xu, W. W., Gerhardt, D. J., Albert, T. J., Richmond, T., et al. (2011). The composition and origins of genomic variation among individuals of the soybean reference cultivar Williams 82. Plant Physiol. 155, 645–655. doi: 10.1104/pp.110.166736
Jensen, J. K., Sorensen, S. O., Harholt, J., Geshi, N., Sakuragi, Y., Moller, I., et al. (2008). Identification of a xylogalacturonan xylosyltransferase involved in pectin biosynthesis in Arabidopsis. Plant Cell 20, 1289–1302. doi: 10.1105/tpc.107.050906
Kelly, J. D., Hosfield, G. L., Varner, G. V., Uebersax, M. A., Long, R. A., and Taylor, J. (1998). Registration of ‘Red Hawk’ dark red kidney bean. Crop Sci. 38, 280–281. doi: 10.2135/cropsci1998.0011183X003800010056x
Kwapata, K., Nguyen, T., and Sticklen, M. (2012). Genetic transformation of common bean (Phaseolus vulgaris L.) with the GUS color marker, the BAR herbicide resistance, and the barley (Hordeum vulgare) HVA1 drought tolerance genes. Int. J. Agron. 2012, 198960–198967. doi: 10.1155/2012/198960
Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., et al. (2009). The sequence alignment/map (SAM) format and SAMtools. Bioinformatics 25, 2078–2079. doi: 10.1093/bioinformatics/btp352
Liu, J., Seul, U., and Thompson, R. (1997). Cloning and characterization of a pollen-specific cDNA encoding a glutamic acid-rich protein (GARP) from potato Solanum berthaultii. Plant Mol. Biol. 33, 291–300. doi: 10.1023/A:1005746713759
McHale, L. K., Haun, W. J., Xu, W. W., Bhaskar, P. B., Anderson, J. E., Hyten, D. L., et al. (2012). Structural variants in the soybean genome localize to clusters of biotic stress-response genes. Plant Physiol. 159, 1295–1308. doi: 10.1104/pp.112.194605
Miller-Messmer, M., Kuhn, K., Bichara, M., Le Ret, M., Imbault, P., and Gualberto, J. M. (2012). RecA-Dependent DNA repair results in increased heteroplasmy of the Arabidopsis mitochondrial genome. Plant Physiol. 159, 211–226. doi: 10.1104/pp.112.194720
Starker, C. G., Parra-Colmenares, A. L., Smith, L., Mitra, R. M., and Long, S. R. (2006). Nitrogen fixation mutants of Medicago truncatula fail to support plant and bacterial symbiotic gene expression. Plant Physiol. 140, 671–680. doi: 10.1104/pp.105.072132
Wang, J., Mullighan, C. G., Easton, J., Roberts, S., Heatley, S. L., Ma, J., et al. (2011). CREST maps somatic structural variation in cancer genomes with base-pair resolution. Nat. Methods 8, 652–654. doi: 10.1038/nmeth.1628
Wollmann, H., Mica, E., Todesco, M., Long, J. A., and Weigel, D. (2010). On reconciling the interactions between APETALA2, miR172, and AGAMOUS with the ABC model of flower development. Development 137, 3633–3642. doi: 10.1242/dev.036673
Xiao, J., Xinping, J., Wang, H., Zhao, R., Fang, Y., Gao, R., et al. (2011). A fast-neutron induced chromosomal fragment deletion of 3BS in wheat landrace Wangshuibai increased its susceptibility to Fusarium head blight. Chromosome Res. 19, 225–234. doi: 10.1007/s10577-011-9192-9
Keywords: Phaseolus vulgaris, common bean, natural variation, structural variation, fast neutron mutation, DNA-Seq
Citation: O'Rourke JA, Iniguez LP, Bucciarelli B, Roessler J, Schmutz J, McClean PE, Jackson SA, Hernandez G, Graham MA, Stupar RM and Vance CP (2013) A re-sequencing based assessment of genomic heterogeneity and fast neutron-induced deletions in a common bean cultivar. Front. Plant Sci. 4:210. doi: 10.3389/fpls.2013.00210
Received: 25 March 2013; Paper pending published: 29 April 2013;
Accepted: 03 June 2013; Published online: 25 June 2013.
Edited by:Federico Sanchez, Universidad Nacional Autonoma de Mexico, Mexico
Reviewed by:Antoni Rafalski, Pioneer Hi-Bred International, a DuPont Business, USA
Valerie Geffroy, Institut National de la Recherche Agronomique, France
Copyright © 2013 O'Rourke, Iniguez, Bucciarelli, Roessler, Schmutz, McClean, Jackson, Hernandez, Graham, Stupar and Vance. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.
*Correspondence: Jamie A. O'Rourke, Plant Science Research Unit, USDA-Agricultural Research Service, 495 Borlaug Hall, 1991 Upper Buford Circle, University of Minnesota, St. Paul, MN 55108, USA e-mail: firstname.lastname@example.org