ORIGINAL RESEARCH article
Front. Plant Sci.
Sec. Plant Breeding
Optimizing training sets to identify superior genotypes in hybrid populations
Provisionally accepted
- 1 National Taiwan University, Taipei, Taiwan
- 2 Cornell University, Ithaca, United States
Identifying superior hybrids from candidate populations is a central goal in plant breeding, particularly for commercial applications and large-scale cultivation. This study evaluates and extends several promising training set optimization methods in genomic selection (GS) to construct predictive models for identifying top-performing genotypes in hybrid populations. The methods investigated are: (i) a ridge regression-based approach, MSPE(v2) Ridge; (ii) a generalized coefficient of determination-based method, CDmean(v2); and (iii) an A-optimality-like ranking strategy, GVaverage. Three evaluation metrics were developed to assess how well each method identifies the genotypes with the highest true breeding values (TBVs). Because TBVs are latent quantities derived from models and cannot be observed directly, the proposed methods were assessed in simulation experiments based on real genotype data from wheat (Triticum aestivum L.), maize (Zea mays), and rice (Oryza sativa L.). The results showed that GVaverage was computationally efficient and generally produced highly informative training sets across a broad range of training set sizes. However, when constructing small training sets, GVaverage occasionally failed to maintain adequate genomic diversity; in such cases, CDmean(v2) is recommended as a more reliable alternative. Overall, the proposed framework provides a flexible and effective approach to optimizing training sets for hybrid breeding, thereby enhancing the accuracy of genomic prediction in practical breeding programs.
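The abstract does not define its three evaluation metrics, so the following is only an illustrative sketch of one plausible way to score how well predictions recover top-performing genotypes in a simulation where TBVs are known. The function name `top_k_overlap`, the choice of a top-k overlap measure, and the example values are all assumptions made here for illustration and are not taken from the article.

```python
def top_k_overlap(true_values, predicted_values, k):
    """Fraction of the k genotypes with the highest true values that also
    appear among the k genotypes with the highest predicted values.

    Hypothetical metric for illustration; not the article's definition.
    """
    if len(true_values) != len(predicted_values):
        raise ValueError("inputs must have the same length")
    if not 1 <= k <= len(true_values):
        raise ValueError("k must be between 1 and the number of genotypes")
    indices = range(len(true_values))
    # Rank genotype indices from highest to lowest value and keep the first k.
    true_top = set(sorted(indices, key=lambda i: true_values[i], reverse=True)[:k])
    pred_top = set(sorted(indices, key=lambda i: predicted_values[i], reverse=True)[:k])
    return len(true_top & pred_top) / k


# Made-up simulated TBVs and predicted values for six candidate hybrids.
tbv = [2.1, 0.4, 1.7, -0.3, 1.9, 0.8]
gebv = [1.8, 0.9, 0.6, -0.1, 1.5, 1.1]
overlap = top_k_overlap(tbv, gebv, k=3)
print(f"top-3 overlap: {overlap:.2f}")
```

In a simulation study of this kind, a score like this could be averaged over repeated simulated traits to compare training sets chosen by different optimization criteria; a value of 1 would mean the predicted top k coincides exactly with the true top k.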
Keywords: genomic best linear unbiased prediction model, genomic prediction, hybrid performance, plant breeding, whole genome regression model
Received: 05 Sep 2025; Accepted: 09 Dec 2025.
Copyright: © 2025 Liao and Chen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Chen-Tuo Liao
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.