AUTHOR=Crosta Margherita, Romani Massimo, Nazzicari Nelson, Ferrari Barbara, Annicchiarico Paolo
TITLE=Genomic prediction and allele mining of agronomic and morphological traits in pea (Pisum sativum) germplasm collections
JOURNAL=Frontiers in Plant Science
VOLUME=Volume 14 - 2023
YEAR=2023
URL=https://www.frontiersin.org/journals/plant-science/articles/10.3389/fpls.2023.1320506
DOI=10.3389/fpls.2023.1320506
ISSN=1664-462X
ABSTRACT=Well-performing genomic prediction (GP) models for polygenic traits and molecular marker sets for oligogenic traits could be useful for identifying promising genetic resources in germplasm collections, establishing core collections, and performing marker-based variety distinction. This study aimed at (i) defining GP models and key marker sets for predicting 15 agronomic or morphophysiological traits in germplasm collections, (ii) verifying the usefulness of these models also for selection in breeding programs, (iii) investigating the consistency between marker-based and morphophysiological trait-based diversity patterns, and (iv) identifying genomic regions associated with the target traits. The study was based on phenotypic data and over 41,000 genotyping-by-sequencing-generated SNP markers for 220 landraces or old cultivars belonging to a world germplasm collection and 11 modern cultivars. Non-metric multi-dimensional scaling (NMDS) and an analysis of population genetic structure indicated a high level of genetic differentiation of material from Western Asia, a major West-East genetic diversity gradient, and quite limited genetic diversity of the improved germplasm. Mantel’s test revealed a low correlation (r = 0.12) between phenotypic and molecular diversity, which increased (r = 0.47) when considering only the molecular diversity at SNPs found significant in the subsequent GWAS analyses. The GWAS identified, inter alia, several regions of chromosome 6 involved in a largely pleiotropic control of vegetative or reproductive organ pigmentation.
We found several significant SNPs for grain yield and straw yield under severe drought and onset of flowering, and one SNP on chromosome 5 for grain protein content. GP models displayed moderately high predictive ability (0.43 to 0.61) for protein content, grain and straw yield, and onset of flowering, and high predictive ability (0.76) for individual seed weight, based on intra-population, intra-environment cross-validations. The models trained on the germplasm collection were also validated across populations and environments for selection within three RIL populations. This validation was challenged by the much narrower diversity of the target material, a more than eight-fold decrease in available markers, and quite different test environments. It led to a loss of predictive ability of about 40% for seed weight, 50% for protein content and straw yield, and 60% for onset of flowering, and to no predictive ability for grain yield.
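
The reported losses of predictive ability are relative to the intra-population, intra-environment values, so they can be turned into approximate absolute values. The following minimal Python sketch illustrates this arithmetic for individual seed weight, the only trait with a single baseline value (0.76) in the abstract. The function names `retained_ability` and `relative_loss` are hypothetical, and the resulting figure of about 0.46 is derived here from the rounded values above; the abstract does not report it.

```python
def retained_ability(baseline: float, loss_fraction: float) -> float:
    """Predictive ability left after a relative loss (e.g. 0.40 for 40%)."""
    if not 0.0 <= loss_fraction <= 1.0:
        raise ValueError("loss_fraction must lie between 0 and 1")
    return baseline * (1.0 - loss_fraction)


def relative_loss(baseline: float, observed: float) -> float:
    """Relative loss of predictive ability with respect to a baseline."""
    if baseline == 0:
        raise ValueError("baseline predictive ability must be non-zero")
    return 1.0 - observed / baseline


# Individual seed weight: 0.76 within the collection, about 40% loss
# when the models are applied to the RIL populations.
seed_weight_in_rils = retained_ability(0.76, 0.40)
print(round(seed_weight_in_rils, 2))
```

The same calculation cannot be done for the other traits from the abstract alone, because their baseline values are given only as a range (0.43 to 0.61).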