Advances in Applied Bioinformatics in Crops

Editorial

12 February 2021

Editorial: Advances in Applied Bioinformatics in Crops

Mary-Ann Blätke, Jedrzej Jakub Szymanski, Evgeny Gladilin, Uwe Scholz and Sebastian Beier

4,495 views

4 citations

Editors

Mary-Ann Blätke

Leibniz Institute of Plant Genetics and Crop Plant Research (IPK)

Sebastian Beier

Bioinformatics, Institute of Bio- and Geosciences - 4, Forschungszentrum Jülich GmbH

Uwe Scholz

Leibniz Institute of Plant Genetics and Crop Plant Research (IPK)

Evgeny Gladilin

Leibniz Institute of Plant Genetics and Crop Plant Research (IPK)

Jedrzej Jakub Szymanski

Leibniz Institute of Plant Genetics and Crop Plant Research (IPK)


Methods

09 July 2020

Comparison Between Core Set Selection Methods Using Different Illumina Marker Platforms: A Case Study of Assessment of Diversity in Wheat

Behnaz Soleimani, 7 more and Dragan Perovic

Collections of plant genetic resources stored in genebanks are an important source of genetic diversity for improvement in plant breeding programs and for conservation of natural variation. The establishment of reduced representative collections from a large set of genotypes is a valuable tool that provides cost-effective access to the diversity present in the whole set. Software such as Core Hunter 3 is available to generate high-quality core sets. In addition, general clustering approaches, e.g., k-medoids, are available to subdivide a large data set into small groups with maximum genetic diversity between groups.

Illumina genotyping platforms are a very efficient tool for the assessment of genetic diversity of plant genetic resources. The accumulation of genotyping data over time using commercial genotyping platforms raises the question of how such a huge amount of information can be efficiently used for creating core collections. In the present study, after developing a 15K wheat Infinium array with 12,908 SNPs and genotyping a set of 479 hexaploid winter wheat lines (Triticum aestivum), a larger data set was created by merging 411 lines previously genotyped with the 90K iSelect array. Overlaying the markers from the 15K and 90K arrays enabled the identification of a common set of 12,806 markers, suggesting that the 15K array is a valuable and cost-effective resource for plant breeding programs.

Finally, we selected genetically diverse core sets out of these 890 wheat genotypes derived from five collections based on the common markers from the 15K and 90K SNP arrays. Two different approaches, k-medoids and Core Hunter 3, were compared, and k-medoids was identified as an efficient method for selecting small core sets out of a large collection of genotypes while retaining the genetic diversity of the original population.
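The core-set idea described above can be illustrated with a minimal k-medoids sketch. This is not the pipeline used in the study: the distance measure (mean allele-count difference on 0/1/2 coded markers), the farthest-first initialisation, the function names, and the six toy lines are all assumptions made for illustration. The medoids of the final clusters serve as the selected core set.

```python
def genotype_distance(a, b):
    """Mean allele-count difference per marker, scaled to the range 0..1."""
    return sum(abs(x - y) for x, y in zip(a, b)) / (2 * len(a))


def k_medoids(genotypes, k, max_iter=100):
    """Return (medoid names, clusters) for a dict of name -> genotype vector."""
    names = sorted(genotypes)
    dist = {n: {m: genotype_distance(genotypes[n], genotypes[m]) for m in names}
            for n in names}

    def assign(medoids):
        # Each medoid anchors its own cluster; other lines join the nearest medoid.
        clusters = {m: [m] for m in medoids}
        for n in names:
            if n not in clusters:
                clusters[min(medoids, key=lambda m: dist[n][m])].append(n)
        return clusters

    # Farthest-first initialisation keeps the run deterministic (an assumption,
    # not necessarily what a given k-medoids implementation does).
    medoids = [names[0]]
    while len(medoids) < k:
        rest = [n for n in names if n not in medoids]
        medoids.append(max(rest, key=lambda n: min(dist[n][m] for m in medoids)))

    for _ in range(max_iter):
        clusters = assign(medoids)
        # Within each cluster, pick the member with the smallest total distance.
        updated = [min(members, key=lambda c: sum(dist[c][o] for o in members))
                   for members in clusters.values()]
        if set(updated) == set(medoids):
            break
        medoids = updated
    return sorted(medoids), assign(medoids)


# Hypothetical toy data: six lines scored at four markers (0/1/2 allele counts).
toy_genotypes = {
    "line_A": [0, 0, 0, 0],
    "line_B": [0, 0, 0, 1],
    "line_C": [2, 2, 2, 2],
    "line_D": [2, 2, 2, 1],
    "line_E": [0, 2, 0, 2],
    "line_F": [0, 2, 1, 2],
}
core_set, clusters = k_medoids(toy_genotypes, k=3)
print(core_set)
```

On real marker data, missing calls and a distance measure suited to the marker type would need to be handled before clustering.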

7,145 views

43 citations

Methods

30 September 2020

R/UAStools::plotshpcreate: Create Multi-Polygon Shapefiles for Extraction of Research Plot Scale Agriculture Remote Sensing Data

Steven L. Anderson and Seth C. Murray

4,083 views

26 citations

Screenshots illustrating the most important search and visualization features, which can be used as an entry point for exploratory data analysis. (A) Data Filters: In the “Search Germplasm” panel, the user can search for germplasm by filtering passport attributes, phenotypic traits and SNP markers. (B) Genomic Diversity Visualization: In the integrated SNP browser, users can inspect and subset the SNP matrices visually for different collections of germplasm. (C) Combined Interactive Visualizations: This tool enables the correlation of results of dimensionality reduction algorithms like PCA or t-SNE on the SNP data with countries of origin and phenotypic traits of the germplasm. (D) Interactive World Map: This tool allows the user to create lasso selections of geo-localized germplasm or to highlight user-defined germplasm collections with their specific tagging colors. (E) Manhattan Plots: This tool provides interactive plots of GWAS analysis results where each SNP data point is linked to the SNP browser. The user can click on a SNP data point and is then automatically guided to the corresponding genomic location in the SNP browser. (F) PCA Scatterplot Matrix: This visualization tool allows visual inspection of the first four principal components while highlighting user-defined germplasm collections with their specific tagging colors. It also allows the user to save a custom lasso selection of data points as a “named collection” of germplasm.

Original Research

11 June 2020

BRIDGE – A Visual Analytics Web Tool for Barley Genebank Genomics

Patrick König, 7 more and Matthias Lange

Genebanks harbor a large treasure trove of untapped plant genetic diversity. A growing world population and a changing climate require an increase in the production and development of stress-resistant plant cultivars while decreasing the acreage. These requirements for improved plant cultivars can be supported by the broader exploitation of plant genetic resources (PGR) as inputs for genomics-assisted breeding. To support this process, we have developed BRIDGE, a data warehouse and exploratory data analysis tool for genebank genomics of barley (Hordeum vulgare L.). Using efficient technologies for data storage, data transfer and web development, we facilitate access to digital genebank resources of barley by prioritizing the interactive and visual analysis of integrated genotypic and phenotypic data. The underlying data resulted from a barley genebank genomics study cataloging sequence and morphological data of 22,626 barley accessions, mainly from the German Federal ex situ genebank. BRIDGE consists of interactively coupled modules to visualize integrated, curated and quality-checked data, such as variation data, results of dimensionality reduction and genome-wide association studies (GWAS), phenotyping results, passport data as well as the geographic distribution of germplasm samples. The core component is a manager for custom collections of germplasm. A search module to find and select germplasm by passport and phenotypic attributes is included, as well as modules to export genotypic data in gzip-compressed variant call format (VCF) files and phenotypic data in MIAPPE-compliant ISA-Tab files. BRIDGE is accessible at the following URL: https://bridge.ipk-gatersleben.de.
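Since the abstract mentions genotype export as gzip-compressed VCF, the sketch below shows one way such a file might be read with the Python standard library alone. It makes no claim about BRIDGE's actual export contents: the function name, the sample names `ACC_1` and `ACC_2`, the chromosome label, and the two toy records are assumptions, and the in-memory file stands in for a downloaded export.

```python
import gzip
import io


def read_vcf_genotypes(fileobj):
    """Return (sample names, {(chrom, pos): {sample: GT string}})."""
    samples, calls = [], {}
    with gzip.open(fileobj, "rt") as handle:
        for line in handle:
            if line.startswith("##"):
                continue  # meta-information lines
            fields = line.rstrip("\n").split("\t")
            if fields[0] == "#CHROM":
                samples = fields[9:]  # sample columns follow FORMAT
                continue
            # The FORMAT column says where GT sits within each sample column.
            gt_index = fields[8].split(":").index("GT")
            calls[(fields[0], int(fields[1]))] = {
                sample: column.split(":")[gt_index]
                for sample, column in zip(samples, fields[9:])
            }
    return samples, calls


# Hypothetical toy VCF, built and compressed in memory for illustration.
header = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO",
          "FORMAT", "ACC_1", "ACC_2"]
toy_lines = [
    "##fileformat=VCFv4.2",
    "\t".join(header),
    "\t".join(["chr1H", "100", ".", "A", "G", ".", "PASS", ".", "GT:DP", "0/0:12", "1/1:9"]),
    "\t".join(["chr1H", "250", ".", "C", "T", ".", "PASS", ".", "GT:DP", "0/1:7", "./.:0"]),
]
toy_vcf_gz = gzip.compress(("\n".join(toy_lines) + "\n").encode("utf-8"))

samples, calls = read_vcf_genotypes(io.BytesIO(toy_vcf_gz))
print(samples, calls[("chr1H", 100)])
```

For large exports, a dedicated VCF library would usually be a better fit than line-by-line parsing; this sketch only shows the file layout.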

7,496 views

40 citations

Example of suppression of leaf crossings using the Frangi filter. From left to right: (A) original image of a wheat shoot, (B) Frangi filter-enhanced image, (C) examples of Frangi-enhanced regions, (D) examples of leaf crossings detected in the original image.

Methods

23 June 2020

Automated Spike Detection in Diverse European Wheat Plants Using Textural Features and the Frangi Filter in 2D Greenhouse Images

Narendra Narisetti, 2 more and Evgeny Gladilin

3,639 views

19 citations

Original Research

28 April 2020

Chromosome-Scale Assembly of Winter Oilseed Rape Brassica napus

HueyTyng Lee, 4 more and Rod Snowdon

Rapeseed (Brassica napus), the second most important oilseed crop globally, originated from an interspecific hybridization between B. rapa and B. oleracea. After this genome collision, B. napus underwent extensive genome restructuring via homoeologous chromosome exchanges, resulting in widespread segmental deletions and duplications. Illicit pairing among genetically similar homoeologous chromosomes during meiosis is common in recent allopolyploids like B. napus, and post-polyploidization restructuring compounds the difficulties of assembling a complex polyploid plant genome. Specifically, genomic rearrangements between highly similar chromosomes are challenging to detect due to the limitation of sequencing read length and ambiguous alignment of reads. Recent advances in long-read sequencing technologies provide promising new opportunities to unravel the genome complexities of B. napus by encompassing breakpoints of genomic rearrangements with high specificity. Moreover, recent evidence revealed ongoing genomic exchanges in natural B. napus, highlighting the need for multiple reference genomes to capture structural variants between accessions. Here we report the first long-read genome assembly of a winter B. napus cultivar. We sequenced the German winter oilseed rape accession ‘Express 617’ using 54.5x of long reads. Short reads, linked reads, optical map data and high-density genetic maps were used to further correct and scaffold the assembly to form pseudochromosomes. The assembled Express 617 genome provides another valuable resource for Brassica genomics in understanding the genetic consequences of polyploidization, crop domestication, and breeding of recently formed crop species.

15,263 views

82 citations

Original Research

27 March 2020

Strategies for Effective Use of Genomic Information in Crop Breeding Programs Serving Africa and South Asia

Nicholas Santantonio, 17 more and Kelly R. Robbins

Much of the world’s population growth will occur in regions where food insecurity is prevalent, with large increases in food demand projected in regions of Africa and South Asia. While improving food security in these regions will require a multi-faceted approach, improved performance of crop varieties in these regions will play a critical role. Current rates of genetic gain in breeding programs serving Africa and South Asia fall below rates achieved in other regions of the world. Given resource constraints, increased genetic gain in these regions cannot be achieved by simply expanding the size of breeding programs. New approaches to breeding are required. The Genomic Open-source Breeding informatics initiative (GOBii) and Excellence in Breeding Platform (EiB) are working with public sector breeding programs to build capacity, develop breeding strategies, and build breeding informatics capabilities to enable routine use of new technologies that can improve the efficiency of breeding programs and increase genetic gains. Simulations evaluating breeding strategies indicate cost-effective implementations of genomic selection (GS) are feasible using relatively small training sets, and proof-of-concept implementations have been validated in the International Maize and Wheat Improvement Center (CIMMYT) maize breeding program. Progress on GOBii, EiB, and implementation of GS in CIMMYT and International Crops Research Institute for the Semi-Arid Tropics (ICRISAT) breeding programs are discussed, as well as strategies for routine implementation of GS in breeding programs serving Africa and South Asia.

8,853 views

39 citations

Cursory display of the ten most abundant assignments from Kraken2 reports for sample #6 data generated by Illumina and Nanopore platforms, respectively, at the (A) family, (B) genus, and (C) species level. (D) Krona plot for the Illumina data (17 levels displayed).
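A "ten most abundant assignments" view like the one in this caption could be derived from a Kraken2 report roughly as sketched below. The sketch assumes the standard six-column, tab-separated report layout (percentage, clade-level read count, directly assigned read count, rank code, NCBI taxonomy ID, indented name). The function name, the taxon names, the taxonomy IDs and all counts are made up for illustration and are not data from the study.

```python
def top_assignments(report_text, rank_code, n=10):
    """Return the n most abundant taxa at one rank as (name, clade reads, percent)."""
    rows = []
    for line in report_text.splitlines():
        if not line.strip():
            continue
        # Assumes exactly six tab-separated columns per line.
        pct, clade_reads, _direct, rank, _taxid, name = line.split("\t")
        if rank == rank_code:
            rows.append((name.strip(), int(clade_reads), float(pct)))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:n]


# Hypothetical toy report; names are indented by taxonomic depth as in Kraken2.
toy_rows = [
    ("100.00", 1000, 0, "R", 1, "root"),
    (" 50.00", 500, 0, "F", 10, "  Family_A"),
    (" 30.00", 300, 300, "G", 11, "    Genus_B"),
    (" 20.00", 200, 200, "G", 12, "    Genus_A"),
    (" 10.00", 100, 100, "G", 13, "    Genus_C"),
]
toy_report = "\n".join("\t".join(str(value) for value in row) for row in toy_rows)

top_genera = top_assignments(toy_report, "G", n=2)
print(top_genera)
```

Passing `"F"` or `"S"` as the rank code would give the family-level or species-level view in the same way.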

Original Research

25 March 2020

Analyzing the Dietary Diary of Bumble Bee

Robert M. Leidenfrost, 4 more and Röbbe Wünschiers

6,179 views

19 citations

$0 contigs, displayed for all PN40024 pseudochromosomes individually (chr#), and for the whole PN40024 reference (all chrs) in the leftmost column. Fractions not covered (given in percent) are shown in red (0), fractions covered by a single (1) BoeWGS1.0 contig in blue, and fractions covered by contig pairs (2) in green. The remaining fractions are covered by three or more BoeWGS1.0 contigs (3, 4, > 4).