AUTHOR=Gouy Matthieu, Bogard Matthieu, Mohamadi Faharidine, Demenou Boris TITLE=Optimizing core collections for genetic studies: a worldwide flax germplasm case study JOURNAL=Frontiers in Plant Science VOLUME=Volume 16 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/plant-science/articles/10.3389/fpls.2025.1675815 DOI=10.3389/fpls.2025.1675815 ISSN=1664-462X ABSTRACT=Core collections provide a strategic approach to reducing population size while retaining genetic diversity and allele frequencies, serving as key resources for genetic research. Although various sampling and selection strategies have been proposed, most focus on either diversity or representativeness, rarely both, and none fully integrates these with the optimization of QTL detection. The first part of our study is a genetic diversity analysis of a flax (Linum usitatissimum L.) germplasm collection maintained by the Arvalis Institute, a prerequisite for the development of core collections. This worldwide collection comprises 1,593 accessions originating from 42 countries and encompasses all major flax-growing regions. It includes both spring- and winter-type lines, as well as oilseed and fiber types. The results revealed a pronounced genetic structure within the germplasm, with six clusters strongly influenced by cultivation purpose (fiber vs. oilseed flax), growth cycle (winter vs. spring), and then geographic origin. Overall genetic diversity was moderate (He = 0.22), with oilseed flax clusters displaying greater diversity (He from 0.21 to 0.27) than fiber flax clusters (He < 0.17). In a second step, we evaluated distinct strategies for core collection development, including approaches originally developed for core collection construction and others developed for optimizing genomic-selection calibration panels. We also introduced an approach based on QTL detection performance, assessed through extensive simulations of QTLs distributed across the genome.
We observed a fundamental trade-off between maximizing diversity and ensuring representativeness in core collection design. Diversity-oriented approaches may overemphasize rare or outlier genotypes, compromising representativeness, while representativeness-focused strategies tend to overlook rare alleles, thus limiting diversity. We found that particular combinations of selection criteria achieved a favorable balance between genetic diversity and representativeness while maintaining a robust capacity to capture QTL signals across the genome. Finally, the approach combining the Shannon index with allelic coverage yielded the best core collection design for GWAS applications in a structured population and was used to select a core collection of 409 accessions for further genetic studies. These results inform the development of optimized core collections tailored to GWAS applications.
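The criteria named in the abstract (expected heterozygosity He, the Shannon index, and allelic coverage) can be illustrated with a minimal sketch. It assumes biallelic markers coded as 0/1/2 allele dosages, with accessions as rows and markers as columns, which the abstract does not state. The function names are hypothetical, and the way the authors combined the Shannon index with allelic coverage is not specified here, so the three quantities are only computed separately.

```python
import math


def allele_freqs(genotypes):
    """Alternate-allele frequency per marker from 0/1/2 dosages (assumed coding)."""
    n = len(genotypes)
    m = len(genotypes[0])
    return [sum(row[j] for row in genotypes) / (2.0 * n) for j in range(m)]


def expected_heterozygosity(genotypes):
    """Mean over markers of He = 2p(1 - p) for biallelic loci."""
    p = allele_freqs(genotypes)
    return sum(2.0 * q * (1.0 - q) for q in p) / len(p)


def shannon_index(genotypes):
    """Mean over markers of -sum(f * ln f) across the two alleles."""
    def entropy(q):
        # Alleles with zero frequency contribute nothing to the sum.
        return -sum(f * math.log(f) for f in (q, 1.0 - q) if f > 0)

    p = allele_freqs(genotypes)
    return sum(entropy(q) for q in p) / len(p)


def allelic_coverage(full, core_rows):
    """Fraction of alleles present in the full panel that the core retains."""
    core = [full[i] for i in core_rows]
    total = covered = 0
    for pf, pc in zip(allele_freqs(full), allele_freqs(core)):
        # Check the alternate allele (p > 0) and the reference allele (p < 1).
        for in_full, in_core in ((pf > 0, pc > 0), (pf < 1, pc < 1)):
            if in_full:
                total += 1
                covered += in_core
    return covered / total


# Toy panel: 4 accessions x 3 markers (made-up values, for illustration only).
panel = [
    [0, 2, 0],
    [2, 2, 0],
    [0, 0, 1],
    [2, 0, 0],
]
core_rows = [0, 1]  # hypothetical candidate core: the first two accessions

print(expected_heterozygosity(panel))
print(shannon_index(panel))
print(allelic_coverage(panel, core_rows))
```

In this toy panel the candidate core drops the third accession, which carries the only copy of the alternate allele at the third marker, so coverage falls below 1. This is the kind of rare-allele loss that the abstract attributes to representativeness-focused selection.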