Characterizing Croatian Wheat Germplasm Diversity and Structure in a European Context by DArT Markers

The narrowing of the genetic base available for future genetic progress is a major concern to plant breeders. To avoid this, strategies to characterize and protect genetic diversity in regional breeding pools are required. In this study, 89 winter wheat cultivars released in Croatia between 1936 and 2006 were genotyped using 1,229 DArT (diversity array technology) markers to assess diversity and population structure. To place the Croatian breeding pool (CBP) in a European context, Croatian wheat cultivars were compared to 523 European cultivars from seven countries using a total of 166 common DArT markers. The results show higher genetic diversity in the wheat breeding pool from Central Europe (CE) than in that from Northern and Western European (NWE) countries. Most of the genetic diversity was attributable to differences among cultivars within countries. When the geographical criterion (CE vs. NWE) was applied, a highly significant difference between regions was obtained that accounted for 16.19% of the total variance, revealing that the CBP represents genetic variation not currently captured in elite European wheat. The current study emphasizes the important contribution made by plant breeders to maintaining wheat genetic diversity and suggests that regional breeding is essential to the maintenance of this diversity. The usefulness of open-access wheat datasets is also highlighted.


INTRODUCTION
Common wheat (Triticum aestivum) is an allohexaploid, combining the genomes of three ancestral diploid grass species, the A-genome of Triticum urartu, the B-genome from a species related to Aegilops speltoides and the D-genome of Aegilops tauschii (Dvorak and Zhang, 1992). The allopolyploid nature and origin of common wheat undoubtedly contributes to its adaptability since its progenitors grow in a wide range of environments from the southern coast of the Caspian Sea, across northern Iran, Turkmenistan, and northern Afghanistan to China (Zohary et al., 1969).
Historically, domestication was the first bottleneck in reducing genetic variation in many crops (Morgante and Salamini, 2003). In wheat the bottleneck was accentuated further as the interspecific crosses that gave rise to hexaploid wheat occurred only a few times (a founder effect). Furthermore, early farmer selection caused genetic drift and depletion of certain alleles from the gene pool (Cox, 1998). A second bottleneck was caused by the post-Mendelian adoption of breeding procedures separating environmental from genetic effects (Morgante and Salamini, 2003), which contributed to the depletion and reduction of diversity by replacing local landraces with newly improved varieties (Harlan, 1972).
A major concern for plant breeders is the narrowing of the genetic base of breeding material which lowers the likelihood of genetic progress. van de Wouw et al. (2010) conducted a meta-analysis of genetic diversity studies in the 20th century for different crop varieties, suggesting there were no clear general trends in genetic diversity for crop varieties released in the last century. What their study revealed was a significant reduction in the diversity of released varieties in the 1960s, but even then the diversity reduction as compared with the diversity levels in the 1950s was only 6%.
There are several issues to be addressed in monitoring genetic diversity.
It is important to identify the initial replacement of landraces by modern varieties, to assess the impact of selective breeding on genetic erosion caused by replacing unique alleles with common ones, and to account for the influence of seed companies releasing similar varieties in different regions.
Furthermore, it has been observed that limited breeding activity reduces the number of released varieties and hence the diversity available, which is a possible threat to the ultimate choice of varieties open to farmers or seed producers (van de Wouw et al., 2010).
Population structure analysis also provides a deeper understanding of genetic diversity in a given germplasm set, as well as the information needed for association mapping, in which accurate estimates of population structure are required to control for relatedness in mixed-model studies (Yu et al., 2006; Zhu et al., 2008). Bayesian model-based clustering, which models variation in ancestral subpopulations along a chromosome as a Markov process (Astle and Balding, 2009), assigns an individual to one of K populations based on its genotype and on the distribution of the various alleles in the K populations (Pritchard et al., 2000), providing insight into gene flow patterns and migration rates (Guillot and Carpentier-Skandalis, 2011).
For all of the reasons above, it is important to have reliable tools available for monitoring and measuring genetic diversity and population structure.
The assessment of genetic diversity relied on the use of pedigree information, morphological (passport data) and biochemical (isozymes and storage proteins) markers (Mohammadi and Prasanna, 2003). With these approaches, genetic diversity among major field crops such as bread wheat (Cox et al., 1986; Souza et al., 1994), durum wheat (Autrique et al., 1996), maize (Smith, 1987, 1988), barley (Martin et al., 1991), and soybean (Cardy and Beversdorf, 1984; Cox et al., 1985) was described. Assessments based on pedigree information are sufficiently reliable (when data about parents are correct), but sometimes they do not accurately mirror parentage and do not take into account the effects of selection, mutation and genetic drift. In contrast, DNA markers allow the assessment of genetic relationships directly at the DNA level (Laidò et al., 2013).
The most common type of DNA marker used in assessing genetic diversity in wheat is the microsatellite, or simple sequence repeat (SSR), marker. Microsatellites have been proposed as one of the most suitable markers for the assessment of genetic diversity among wheat accessions, because they are multi-allelic, abundant, chromosome specific and evenly distributed along chromosomes (Röder et al., 1995). The major drawback of microsatellites is that they must be isolated de novo for each species (or group of closely related species), because they are mostly located in non-coding regions where the nucleotide substitution rate is higher than in coding regions (Zane et al., 2002).
Since then, marker systems have evolved to include DArT markers, which assay for the presence (or amount) of a specific DNA fragment in the total genomic DNA and simultaneously type several thousand loci in a single assay (Jaccoud et al., 2001). Although more advanced tools, such as SNP markers, have become more widely used, Akbari et al. (2006) demonstrated that DArTs perform very well in revealing the genetic relationships among bread wheat varieties and behave in a Mendelian fashion, concluding that they can be used effectively and informatively to genotype polyploid species such as wheat.
The main objectives of this study were to use DArTs for (1) assessing genetic diversity available in the Croatian winter wheat breeding pool; (2) describing genetic population structure; (3) providing new information about the level of genetic diversity and structure from a single eco-geographic region; and (4) placing the genetic relationship in the Croatian breeding pool (CBP) in the wider context of European-wide winter wheat diversity as facilitated by combined, open-access DArT genotypes.

MATERIALS AND METHODS
The cultivars (Supplementary Table S1) were bred in two regions: eastern Croatia (PIO) and western Croatia (Bc; FAZ).

European Breeding Pool (EBP)
In order to place the CBP in a European context, accessible DArT genotypes from two published winter wheat panels were included in this study. The first was the TriticeaeGenome (TG) panel: a panel of 376 elite wheat varieties from France, Germany and the UK (Bentley et al., 2014, dataset available at www.cerealsdb.uk.net). The second was a European diversity panel (ED) consisting of 94 mostly European wheat varieties recently described by Nielsen et al. (2014). A summary of the number of inbred lines in each panel, and across the panels, and the number of countries represented is given in Supplementary Table S2.

DArT Genotyping
For DArT analysis of the CBP, DNA was extracted from young wheat leaf tissue from a single plant of each genotype using the protocol recommended by Triticarte Pty. Ltd. (http://www.triticarte.com.au/content/DNA-preparation.html) and sent to Triticarte for DArT analysis using the common wheat PstI(TaqI) version 2.5 array. A total of 1,531 DArT markers were scored, of which 1,229 markers were retained for further analysis based on a Q value (an estimate of marker quality) above 80.
Additional DArT data representing the EBP were collated from previous datasets (Bentley et al., 2014; Nielsen et al., 2014). There were 16 overlapping lines between the TG and ED panels, and duplicates were retained where there were genotyping discrepancies (10 lines). Across the CBP and EBP, only countries represented by at least 10 cultivars were included, and only markers common to the datasets were used after removing those with ≥10% missing data. The data were thinned by removing one marker from each pair of markers with an absolute correlation coefficient of >0.95 and by applying a minor allele frequency threshold of 0.05. The final dataset combining the CBP and EBP consisted of 523 lines from seven countries and 166 DArT markers (Supplementary Table S2). The 166 DArT markers were assigned map positions based on published mapping information (Huang et al., 2012).
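The marker filtering described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the function names, the 0/1/None encoding of DArT scores, the reading of the 0.05 figure as a lower bound on minor allele frequency, and the greedy order in which correlated markers are dropped are all assumptions made for illustration.

```python
from math import sqrt

def missing_rate(scores):
    """Fraction of lines with no score (None) for one marker."""
    return sum(s is None for s in scores) / len(scores)

def minor_allele_freq(scores):
    """Frequency of the rarer of the two states among scored lines."""
    scored = [s for s in scores if s is not None]
    p = sum(scored) / len(scored)
    return min(p, 1.0 - p)

def pearson(a, b):
    """Pearson correlation over lines scored for both markers."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    n = len(pairs)
    if n == 0:
        return 0.0
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def thin_markers(markers, max_missing=0.10, min_maf=0.05, max_abs_r=0.95):
    """Return the markers that pass all three filters, in input order.

    markers maps a marker name to a list of 0/1/None scores, one per line.
    A marker is dropped if it is too incomplete, too rare, or too strongly
    correlated with a marker that has already been kept (assumed greedy rule).
    """
    kept = {}
    for name, scores in markers.items():
        if missing_rate(scores) >= max_missing:
            continue
        if minor_allele_freq(scores) < min_maf:
            continue
        if any(abs(pearson(scores, other)) > max_abs_r for other in kept.values()):
            continue
        kept[name] = scores
    return kept

# Hypothetical toy data: six markers scored on ten lines.
toy = {
    "m1": [1, 0, 1, 0, 1, 0, 1, 0, 1, 0],
    "m2": [1, 0, 1, 0, 1, 0, 1, 0, 1, 0],     # identical to m1
    "m3": [0, 1, 0, 1, 0, 1, 0, 1, 0, 1],     # perfectly anti-correlated with m1
    "m4": [1, 1, 0, 0, 1, 1, 0, 0, 1, 0],     # weakly correlated with m1
    "m5": [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],     # monomorphic
    "m6": [1, None, 0, 1, 0, 1, 0, 0, 1, 1],  # 10% missing
}
retained = thin_markers(toy)
```

Because the correlation filter uses the absolute value of r, a marker that is the mirror image of one already kept is treated as redundant, just like an exact duplicate.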

Genetic Diversity of Croatian Winter Wheat Breeding Pool
Genetic diversity of Croatian winter wheat cultivars from two breeding programs (PIO and Bc/FAZ) as assessed using DArT markers was analyzed using the following parameters: percentage of polymorphic loci (%P), Shannon's diversity index (Sh; Lewontin, 1972), effective number of alleles (N_E; Berg and Hamrick, 1997), expected heterozygosity (H_E; Nei, 1973), and polymorphic information content (PIC; Botstein et al., 1980).
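For orientation, the sketch below computes these parameters for biallelic presence/absence markers from the frequency of the "present" state. It rests on assumptions the text does not state: that band frequency can stand in for allele frequency (plausible for inbred lines), that Shannon's index uses base-2 logarithms, and that PIC takes the two-allele form of the Botstein et al. (1980) formula. Function and key names are illustrative.

```python
from math import log

def locus_stats(p, log_base=2):
    """Diversity statistics for one biallelic marker with frequency p."""
    q = 1.0 - p
    homozygosity = p * p + q * q
    # Terms with zero frequency contribute nothing to Shannon's index.
    shannon = -sum(f * log(f, log_base) for f in (p, q) if f > 0)
    return {
        "Sh": shannon,
        "NE": 1.0 / homozygosity,                       # effective number of alleles
        "HE": 1.0 - homozygosity,                       # expected heterozygosity
        "PIC": 1.0 - homozygosity - 2 * p * p * q * q,  # two-allele PIC
    }

def diversity_summary(freqs, log_base=2):
    """Average the per-locus statistics and add the percentage of polymorphic loci."""
    per_locus = [locus_stats(p, log_base) for p in freqs]
    n = len(freqs)
    summary = {key: sum(s[key] for s in per_locus) / n
               for key in ("Sh", "NE", "HE", "PIC")}
    summary["%P"] = 100.0 * sum(0 < p < 1 for p in freqs) / n
    return summary

# Hypothetical example: one balanced marker and one fixed marker.
balanced = locus_stats(0.5)
summary = diversity_summary([0.5, 1.0])
```

A marker at frequency 0.5 gives the maximum of every statistic for two alleles, while a fixed marker contributes zero diversity and lowers the averages.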
In order to quantify rare markers in wheat cultivars from each program we calculated the modified 'frequency downweighted marker values (DW)' based on the measure proposed by Schönswetter and Tribsch (2005) and implemented in AFLPdat (Ehrich, 2006). Instead of counting the number of occurrences of each marker in each group of cultivars and dividing it by the number of occurrences of that particular marker in the total dataset, we divided the frequency of each marker in each group of cultivars by the frequency of that particular marker in the total dataset. Finally, the rarity index (RI) for each program was calculated as the average over all marker loci:

$$RI_j = \frac{1}{I}\sum_{i=1}^{I}\frac{p_{ij}}{P_i}$$

where I is the number of markers, p_ij is the frequency of the ith marker in group of cultivars j, and P_i is the frequency of the ith marker in the total dataset. Thus, we even out the unequal sample sizes. The value of RI is expected to be higher for a program in which overall rare markers were frequent among cultivars. The estimates of diversity parameters between cultivars were compared using repeated measures analysis of variance carried out using PROC GLM in SAS v. 9.2 (SAS Institute, 2004).
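The rarity index as described, the ratio of within-group to overall marker frequency averaged over markers, can be sketched in a few lines of Python. The function names and the 0/1 score encoding are illustrative, and the sketch assumes every marker is present in at least one cultivar of the total dataset so that no overall frequency is zero.

```python
def marker_frequencies(genotypes):
    """Per-marker frequency of the 'present' state.

    genotypes is a list of cultivars, each a list of 0/1 marker scores.
    """
    n = len(genotypes)
    return [sum(column) / n for column in zip(*genotypes)]

def rarity_index(group, total):
    """Average over markers of (frequency in group) / (frequency in total)."""
    p_group = marker_frequencies(group)
    p_total = marker_frequencies(total)
    ratios = [pg / pt for pg, pt in zip(p_group, p_total)]
    return sum(ratios) / len(ratios)

# Hypothetical toy data: two programs, two cultivars each, two markers.
program_a = [[1, 1], [1, 0]]
program_b = [[1, 0], [1, 0]]
everything = program_a + program_b

ri_a = rarity_index(program_a, everything)
ri_b = rarity_index(program_b, everything)
```

In the toy data the second marker occurs in only one of four cultivars, and that cultivar belongs to program A, so program A receives the higher index.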
Genetic distances between pairs of cultivars were calculated by Dice's distance coefficient (D_Dice; Dice, 1945). Cluster analysis based on the dissimilarity matrix was performed using the Neighbor joining (NJ) method and the statistical support of the branches was tested with bootstrap analysis using 1,000 replicates (Felsenstein, 1985). The calculations were made using PAST version 2.01 (Hammer et al., 2001).
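As an illustration of the distance measure, the following Python sketch computes a Dice distance between binary marker profiles, taken here as one minus the Dice similarity 2a/(2a + b + c), where a counts markers present in both cultivars and b and c count markers present in only one. How PAST handles missing scores is not stated in the text, so the sketch assumes complete 0/1 data; the function names are illustrative.

```python
def dice_distance(x, y):
    """Dice distance between two 0/1 marker profiles of equal length."""
    both = sum(1 for i, j in zip(x, y) if i == 1 and j == 1)
    only_x = sum(1 for i, j in zip(x, y) if i == 1 and j == 0)
    only_y = sum(1 for i, j in zip(x, y) if i == 0 and j == 1)
    denominator = 2 * both + only_x + only_y
    if denominator == 0:
        return 0.0  # neither profile has any marker present
    return (only_x + only_y) / denominator

def distance_matrix(profiles):
    """Symmetric matrix of pairwise Dice distances."""
    return [[dice_distance(a, b) for b in profiles] for a in profiles]

# Hypothetical toy profiles for three cultivars scored at four markers.
profiles = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
]
matrix = distance_matrix(profiles)
```

Shared absences do not enter the coefficient, so two cultivars are indistinguishable (distance 0.00) only when every marker present in one is also present in the other.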
The analysis of molecular variance (AMOVA; Excoffier et al., 1992) using ARLEQUIN ver. 3.0 (Excoffier et al., 2005) was used to partition the molecular diversity based on DArT markers separately between and within two Croatian wheat breeding programs, namely PIO and Bc/FAZ. The variance components were tested statistically by non-parametric randomisation tests using 10,000 permutations.
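The variance partition behind a one-level AMOVA can be sketched as follows. This is a simplified illustration and not ARLEQUIN's implementation: it assumes complete 0/1 profiles, uses squared Euclidean distance (the number of mismatching markers), and omits the permutation test. The function names are illustrative.

```python
from itertools import combinations

def sq_dist(x, y):
    """Squared Euclidean distance; for 0/1 scores this counts mismatches."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def ssd(profiles):
    """Sum of squared deviations: summed pairwise squared distances over n."""
    n = len(profiles)
    return sum(sq_dist(x, y) for x, y in combinations(profiles, 2)) / n

def amova_one_way(groups):
    """Partition variance among and within groups and return phi_ST."""
    pooled = [profile for group in groups for profile in group]
    n_total, n_groups = len(pooled), len(groups)
    ssd_within = sum(ssd(group) for group in groups)
    ssd_among = ssd(pooled) - ssd_within
    ms_among = ssd_among / (n_groups - 1)
    ms_within = ssd_within / (n_total - n_groups)
    # Average group size corrected for unequal sample sizes.
    n0 = (n_total - sum(len(g) ** 2 for g in groups) / n_total) / (n_groups - 1)
    var_among = (ms_among - ms_within) / n0
    var_within = ms_within
    total = var_among + var_within
    phi_st = var_among / total if total else 0.0
    return {"var_among": var_among, "var_within": var_within, "phi_st": phi_st}

# Hypothetical toy data: two fully differentiated groups of two cultivars.
group_1 = [[0, 0], [0, 0]]
group_2 = [[1, 1], [1, 1]]
result = amova_one_way([group_1, group_2])
```

With no variation inside either group, all variance is among groups and φ_ST reaches 1. In a full analysis, significance would be assessed by recomputing φ_ST after randomly reassigning cultivars to groups many times.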
A model-based clustering method was applied to infer the genetic structure of the Croatian winter wheat breeding pool using the software STRUCTURE ver. 2.3.3 (Pritchard et al., 2000). Ten runs for each number of clusters (K), ranging from 1 to 11, were carried out on the Isabella computer cluster at the University of Zagreb, University Computing Centre (SRCE). Each run consisted of a burn-in period of 200,000 steps followed by 10^6 Monte Carlo Markov Chain (MCMC) replicates assuming an admixture model and correlated allele frequencies. The most likely number of clusters (K) was determined according to the ad hoc statistic ΔK, as described by Evanno et al. (2005) and as implemented in Structure-sum Ver. 2011 (Ehrich et al., 2007).
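The ad hoc statistic of Evanno et al. (2005), usually written ΔK, relates the second-order rate of change of the log-likelihood across K to its run-to-run standard deviation. The Python sketch below follows one common reading, in which the absolute second difference is taken on the run means; whether Structure-sum computes it in exactly this way is an assumption, and the function name and toy values are hypothetical.

```python
from statistics import mean, stdev

def delta_k(lnp_by_k):
    """Delta K for every K that has both neighbours K-1 and K+1.

    lnp_by_k maps K to a list of ln Pr(X|K) values, one per independent run.
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue  # undefined at the smallest and largest K tested
        spread = stdev(lnp_by_k[k])
        if spread == 0:
            continue  # undefined when all runs agree exactly
        second_difference = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_difference / spread
    return result

# Hypothetical log-likelihoods from two runs at each K.
toy_runs = {
    1: [-100.0, -102.0],
    2: [-50.0, -52.0],
    3: [-45.0, -47.0],
    4: [-44.0, -46.0],
}
dk = delta_k(toy_runs)
best_k = max(dk, key=dk.get)
```

In the toy data the likelihood rises steeply from K = 1 to K = 2 and flattens afterwards, so ΔK peaks at K = 2 even though the raw likelihood keeps increasing.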
The admixture was quantified using Shannon's diversity index based on proportions of membership in each cluster inferred by STRUCTURE. In this sense, Shannon's diversity index will be equal to zero for a cultivar having 100% of its genome estimated to belong to a cluster while it will reach its maximum value when the proportions of membership of a given cultivar are equal in all clusters (the most admixed cultivar). Furthermore, the total diversity (Sh_Total) of a breeding program could be calculated from the mean proportions of membership and related to the mean within-cultivar diversity (Sh_Mean) calculated by averaging the Shannon's diversity indices of the cultivars belonging to a breeding program. Thus, the admixture level of a breeding program can be separated into the proportion of admixture attributable to within-cultivar (Sh_Mean/Sh_Total) and among-cultivar [(Sh_Total - Sh_Mean)/Sh_Total] admixture.
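The partition of admixture into within-cultivar and among-cultivar components can be sketched in Python as below. The logarithm base is not stated in the text; base 10 is assumed here as a default parameter, and the two proportions do not depend on that choice because the base cancels in the ratios. Function names and the toy membership values are illustrative.

```python
from math import log

def shannon(proportions, base=10):
    """Shannon's index of a vector of cluster membership proportions."""
    return -sum(q * log(q, base) for q in proportions if q > 0)

def admixture_partition(memberships, base=10):
    """Split a program's admixture into within- and among-cultivar parts.

    memberships is a list of cultivars, each a list of membership
    proportions (one per cluster) that sums to 1.
    """
    n = len(memberships)
    mean_props = [sum(column) / n for column in zip(*memberships)]
    sh_total = shannon(mean_props, base)
    sh_mean = sum(shannon(row, base) for row in memberships) / n
    if sh_total == 0:
        within = among = 0.0  # every cultivar sits wholly in the same cluster
    else:
        within = sh_mean / sh_total
        among = (sh_total - sh_mean) / sh_total
    return {"Sh_Total": sh_total, "Sh_Mean": sh_mean,
            "within": within, "among": among}

# Hypothetical extremes with two clusters.
pure_but_different = admixture_partition([[1.0, 0.0], [0.0, 1.0]])
all_evenly_mixed = admixture_partition([[0.5, 0.5], [0.5, 0.5]])
```

The two toy programs have the same total diversity, but in the first it arises entirely from differences among unadmixed cultivars, and in the second entirely from admixture within each cultivar.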

Genetic Relationships Among Winter Wheat Cultivars from Different European Countries
Diversity of DArT markers among wheat cultivars from seven European countries (Croatia, Denmark, France, Germany, Hungary, Sweden, UK) was assessed using the same parameters as above (%P, Sh, N_E, H_E, PIC). The estimates of diversity parameters among countries were compared as above.
Post hoc Bonferroni's adjustments were used to compare the means of diversity estimates among countries at significance level P < 0.05.
In order to graphically represent genetic relationships among wheat cultivars from seven European countries, a factorial correspondence analysis (FCA) was carried out using Genetix 4.05 (Belkhir et al., 2004).
The AMOVA using ARLEQUIN was used to partition the molecular diversity of wheat cultivars based on DArT markers (A) within and among seven European countries and (B) among regions (Central Europe vs. Northern and Western Europe), among countries within regions and within countries. The Central European region comprised Croatia and Hungary while the Northern and Western European region included the remaining countries. Pairwise comparisons examined with AMOVA resulted in φ_ST values that are equivalent to the proportion of total variance that is partitioned between groups of cultivars originating from two different countries. To obtain a distance matrix, φ_ST values were interpreted as the inter-country distance average (Román et al., 2001). A cluster analysis based on the φ_ST matrix was performed using the UPGMA method, as above.
The population structure of the European wheat cultivars was assessed using STRUCTURE ver. 2.3.3 (Pritchard et al., 2000), as above.

RESULTS
Genetic Diversity and Population Structure of Croatian Winter Wheat Breeding Pool
A total of 1,229 polymorphic DArT markers were included in the diversity analysis of 89 Croatian wheat cultivars. Shannon's diversity index (Sh), used to quantify the genetic diversity, was 0.78 for PIO and 0.76 for Bc/FAZ. For PIO and Bc/FAZ, respectively, the average effective number of alleles per locus (N_E) was 1.63 and 1.61, the expected heterozygosity (H_E) was 0.36 and 0.35, with an average of 0.38, and the polymorphism information content (PIC) was 0.29 and 0.28, with an average of 0.30. The rarity index (RI) was 0.985 for PIO and 1.036 for Bc/FAZ. Slightly higher genetic diversity was found in the winter wheat breeding pool from eastern Croatia (PIO), while overall rare markers were found more frequently among cultivars from western Croatia (Bc/FAZ; Table 1). The average genetic distance (D_Dice) estimated for DArTs was 0.27. Four indistinguishable pairs of cultivars were found (D_Dice = 0.00: C02 Afrodita/C12 Dvanaesta; C43 Nada/C61 Ratarka; C44 Neretva/C65 Ruža; C47 Njivka/C59 Poljarka), all originating from the breeding program carried out at the Agricultural Institute Osijek (PIO). The highest genetic distance (D_Dice = 0.49) was calculated between cultivars C31 Lana and C38 Marija bred by the Bc Institute for Plant Breeding and Production of Field Crops, Zagreb (Bc).
The Neighbor joining tree shows differentiation of the CBP into two clusters (Figure 1). The larger group included 72 cultivars of different origin (49 from PIO and 23 from Bc/FAZ), suggesting that breeders used the same or similar genetic material as parental lines. As a result, some cultivars from different breeding programs could show greater similarity than those from the same breeding program. The smaller group comprised 17 cultivars (14 from PIO and 3 from Bc/FAZ) that were mostly newly registered cultivars, except for the cultivar Marija (Bc), which was one of the most widespread cultivars in the region during the 1980s and was frequently used in crosses as one of the parents.
Hierarchical analysis of genetic diversity using AMOVA was performed to analyze the partitioning of the genetic variation between and within breeding programs in the CBP. Although most of the genetic diversity was attributable to differences among cultivars within a breeding program (94.36%), the significant φ_ST value among breeding programs (φ_ST = 0.06; P < 0.0001) suggested the existence of a moderate level of genetic differentiation between programs.
The population structure of Croatian wheat cultivars was assessed using the Bayesian model-based clustering method. In general, average estimates of the likelihood of the data, conditional on a given number of clusters, ln[Pr(X|K)], kept increasing with higher K (number of clusters), but the standard deviations among different runs for each K followed the same pattern (Figure 2A). The highest ΔK values were observed for K = 2 (5.47) followed by K = 5 (4.20). The proportion of membership of each Croatian wheat cultivar in each cluster at K = 2 and 5 is shown in Figure 1. At K = 2 the great majority of the cultivars were assigned to cluster A while cluster B comprised 17 cultivars, in complete concordance with the results of the distance-based clustering analysis (Figure 1). All the cultivars had membership probabilities higher than 90% in a particular cluster. At K = 5 the majority of cultivars assigned to cluster B at K = 2 were included in cluster E.

Figure 2 caption (fragment): [...] values for each of the ten independent runs for each K and ΔK values for each K based on the second-order rate of change of the likelihood function with respect to K described by Evanno et al. (2005).
The mean proportion of membership of Croatian wheat cultivars bred by the Agricultural Institute Osijek (PIO) and the Bc Institute for Plant Breeding and Production of Field Crops or the University of Zagreb, Faculty of Agriculture (Bc/FAZ) in each cluster at K = 5 is shown in Supplementary Figure S1.
The mean proportions of membership of PIO cultivars in each of the five clusters ranged from 0.099 to 0.293. Bc/FAZ cultivars showed substantially higher proportions of membership in two clusters (D: 0.462; B: 0.290) while the proportions in the rest of the clusters were lower than 0.100. Shannon's diversity index based on proportions of membership in each of the five clusters was used to assess the admixture levels of the Croatian breeding programs. The total Shannon's diversity index of PIO cultivars (Sh_Total = 0.674) was slightly higher than that of Bc/FAZ cultivars (Sh_Total = 0.577), while the mean within-cultivar diversity of PIO cultivars (Sh_Mean = 0.285) was lower than that of Bc/FAZ cultivars (Sh_Mean = 0.301). Thus, in the PIO breeding program the admixture attributable to the within-cultivar component was lower than that attributable to the among-cultivar component (42.31 vs. 57.69%, respectively), while in the Bc/FAZ breeding program the opposite was true (52.28 vs. 47.72%).
At K = 5, 23 cultivars were classified as representative of a cluster (having more than 90% of their genome estimated to belong to a cluster), nine belonged to a cluster with membership probabilities between 75 and 90%, and 57 could be considered mixed (with membership probabilities < 75% for all clusters; Supplementary Table S3). Out of 26 cultivars bred by Bc/FAZ, six were classified as representative, all of them belonging to cluster D. On the other hand, the seventeen of 63 PIO cultivars classified as representative were spread across all the detected clusters (A-E). The majority of cultivars from both programs were classified as mixed (Bc/FAZ: 69%; PIO: 62%).

Genetic Diversity and Population Structure Among Winter Wheat Cultivars from Different European Countries
To contextualize the genetic diversity of the CBP, a total of 166 polymorphic DArT markers were included in the diversity analysis of 523 European wheat cultivars from seven countries (Supplementary Table S2). The 166 DArT markers in the thinned, combined CBP/EBP dataset were relatively evenly distributed across the bread wheat genome, although there were only 15 markers on the D genome, with no markers on either 4D or 6D (Supplementary Table S4). Using Shannon's diversity index (Sh) it was found that genetic diversity varied from the highest levels found in Central European countries (Croatia: Sh = 0.83; Hungary: Sh = 0.73), toward the lowest, found in Nordic countries (Denmark: Sh = 0.61; Sweden: Sh = 0.60). The same pattern was observed for the expected heterozygosity (H_E) and PIC. The highest number of effective alleles (N_E) was found in the Croatian and the lowest in the UK wheat pool (1.68 vs. 1.45). The rarity index (RI) value was 1.32 for the Hungarian and 1.28 for the Croatian wheat pool (Table 2). Figure 3 represents the genetic relationship among cultivars defined by the first two axes of the FCA, which accounted for 71.28 and 11.89% of the total inertia, respectively. Cultivars from CE countries clustered separately from the cultivars from NWE countries along the first axis, suggesting that the CE cultivars represented genetic diversity outside the NWE breeding pool. Along the second axis the cultivars from Germany and Sweden tended to be plotted separately from French and UK cultivars while the Danish cultivars were plotted in the middle position.
One-way AMOVA showed that most of the genetic diversity was attributable to differences among cultivars within countries (88.69%; Table 3). The φ_ST value among countries was moderate, but highly significant (φ_ST = 0.11; P < 0.0001). When the geographical criterion (CE vs. NWE) was applied in the two-way AMOVA, a highly significant difference between regions was obtained (φ_ST = 0.16; P < 0.0001) that accounted for 16.19% of the total variance, while the differences among countries within regions accounted for only 3.50% (Table 3). Pairwise φ_ST values among countries ranged from 0.021 between France and the UK to 0.277 between the UK and Hungary, with an average value of 0.122 (Table 4). All the φ_ST values were highly significant. Genetic differentiation between the CBP and the pools of other countries was higher than average in all cases except that of Hungary (φ_ST = 0.074). As expected, the UPGMA tree based on inter-country φ_ST values showed clear separation of CE cultivars from those belonging to NWE countries. In accordance with the FCA results, a further subdivision separating the German and Swedish breeding pools from the French, UK, and Danish cultivars was observed (Figure 4).
Analysis of the population structure of 523 cultivars from seven European countries using a Bayesian approach revealed a similar pattern, namely, that the average estimates of ln[Pr(X|K)] as well as the standard deviations among runs kept increasing with higher K (Figure 2B). The highest ΔK values were observed for K = 2 (13075.54) followed by K = 3 (429.51), while all the subsequent Ks had substantially lower ΔK values (<5). The clusters identified at K = 2 and K = 3 correspond well with those identified by FCA and UPGMA analysis based on pairwise φ_ST values. At K = 2, cluster A contained the majority of cultivars from NWE countries, while the majority of cultivars from CE countries belonged to cluster B (Figure 4). At K = 3, cluster A was split into two subclusters (A1 and A2).
Most of the UK cultivars were assigned to subcluster A1, subcluster A2 comprised most of the cultivars from Germany and Sweden, while Danish and French cultivars were almost equally distributed among the two subclusters. The average proportion of membership of wheat cultivars in each of the seven European countries in each of the two (K = 2) and three (K = 3) clusters is presented in Figure 4. As in the FCA and UPGMA analyses, the divergence between subclusters A1 and A2 was moderate in comparison to the split between the two subclusters and cluster B, as shown by the much larger net nucleotide distance (provided by Structure) between A1 and B as well as A2 and B (0.19 and 0.13, respectively) than between A1 and A2 (0.06).

Table 2 footnote: n, number of cultivars; %P, percentage of polymorphic markers; Sh, Shannon's diversity index; N_E, effective number of alleles; H_E, expected heterozygosity; PIC, polymorphic information content; RI, rarity index. *Averages followed by the same letter are not significantly different among countries (P < 0.05).

DISCUSSION
Genetic Diversity and Population Structure of Croatian Winter Wheat Breeding Pool
Understanding the level and structure of the genetic diversity of a crop is a prerequisite for the conservation and efficient use of the available germplasm for plant breeding (Laidò et al., 2013). Further, monitoring it can assist in the choice of parents with desired alleles and in assessing changes in allelic frequencies (Christiansen et al., 2002). Analysis of Croatian wheat germplasm (89 cultivars) with DArT markers revealed different levels of genetic diversity, with an average PIC of 0.30, an expected heterozygosity of 0.375 and an effective number of alleles per locus of 1.641. This is within the range of values reported in previous studies on wheat diversity. For example, Stodart et al. (2007) found, when analyzing the genetic diversity of 44 wheat landraces from five geographic regions, that 256 DArT markers had a non-adjusted PIC value of 0.43. Raman [...]

Table 3 footnote: P-values as obtained by 10,000 permutations: "***" corresponds to significance at the 0.1% nominal level and "**" to significance at the 1% nominal level.

Our results suggest that only a few genetically similar wheat lines were used for creating genetic variability for breeding purposes and this reflects the historical course of Croatian wheat breeding. Beginning in the 20th century, hybridization was initiated and selections were made from local landraces such as Sirban Prolific and Bankuty wheats. More recently, Italian cultivars including Strampelli's cultivars (Carlotta Strampelli, Mentana, and San Pastore) were used as progenitors and sources of earliness, high yield capacity, and shorter straw. From the mid 1950s onward, domestic material was crossed mostly with wheat varieties originating from Italy, countries of the former Soviet Union, the USA, and France (Denčić, 2001; Brandolini and Vaccino, 2012). The later differences in genetic diversity were introduced by breeders with preferences for different plant ideotypes using diverse sources of germplasm to realize their own breeding objectives.
At K = 5 the proportions of membership of PIO cultivars are more evenly distributed among clusters than those of cultivars bred by Bc/FAZ, leading to a higher total admixture of PIO cultivars based on Shannon's diversity index. It follows that the Bc/FAZ breeding program relies on a narrower genetic base that could be efficiently broadened by identifying suitable cultivars belonging to underrepresented subclusters (i.e., A, C, and E) and incorporating them into crossing schemes. However, in the PIO breeding program the admixture attributable to the among-cultivar component is higher than that attributable to the within-cultivar component because the cultivars classified as representative of different clusters make up 26%. This suggests that inter-cluster crossing of cultivars is likely to result in novel combinations of promising traits.

Croatian Wheat Germplasm Diversity and Structure within European Context
Combining the genotypic information from the CBP with two additional EBP open-access datasets (Bentley et al., 2014; Nielsen et al., 2014) in the current study provided a broader perspective on the diversity within this regional program.
The results show higher genetic diversity and rare marker frequency in the winter wheat breeding pool from Central Europe (Croatia and Hungary; CE) as compared to the Northern and Western European (NWE) countries (Denmark, France, Germany, Sweden, and UK). In agreement with this, Huang et al. (2002) and Röder et al. (2002), using SSR markers, found the highest diversity in Southern European wheats as compared to those from Northern and Western Europe. The most plausible explanation for the observed levels of genetic diversity and its distribution is the presence of relatively more alleles as a result of breeding practice (Roussel et al., 2005; Hai et al., 2007).
AMOVA showed that most of the genetic diversity was attributable to differences among cultivars within countries (>80%), which is consistent with other reports (Hai et al., 2007). Roussel et al. (2005) and White et al. (2008) found that the geographical factor was more important than the temporal factor in explaining genetic structure, reflecting the wheat breeding practice of longer and more intense selection in NWE than CE, which had different objectives.
When the genetic structure of our CBP/EBP wheat panel was analyzed, the highest ΔK value was found at K = 2 (Figure 2B), but at K = 3 there was further separation as well (Figure 4). Similar results were reported by Le Couviour et al. (2011), who found clear separation due to geographical specificity at K = 2 for wheat varieties originating from the UK, Germany and France; further separation within the French cluster, however, was not due to geographic origin or linked to the breeding companies, but rather reinforced the idea that a unique elite variety extensively used as a progenitor caused this subdivision.
Further selective differences between the NWE and CE wheat breeding pools are due to latitude effects, where the presence of photoperiod-sensitive and photoperiod-insensitive alleles causes different adaptation patterns to variable environments. This provides context for the genetic diversity found and indicates that although the CBP has historically derived parental germplasm from various European and international programs (as above), the diversity retained within it is regionally distinct.
The use of a combined dataset in the current study shows that open-access datasets and common genotyping platforms represent an opportunity to use data-mining to unlock genetic potential. Although useful here for placing the CBP within the context of the wider EBP this approach was limited in this case due to the lack of overlapping DArT markers between the three discrete datasets.
In terms of usefulness in future studies (i.e., for incorporating phenotypic information for mapping), the combined dataset also holds limited potential due to the lack of genome-wide coverage, particularly on the D-genome. DArT markers have been superseded by SNP markers, which can be applied either singly (e.g., as single marker assays) or in array form. The availability of affordable, high-density markers will add value to future diversity studies, although significant insight into genetic diversity has been possible using DArTs in the current study.

CONCLUSION
The current study emphasizes the important contribution made by plant breeders to European wheat genetic diversity as each program tends to represent a pool of regional divergence. This suggests that maintenance of crop diversity as a whole would benefit from an increase in the number of regional breeding programs rather than the consolidation that is often seen, particularly in commercial breeding. As demonstrated here, DArTs showed their efficacy in describing genetic diversity and population structure of the CBP. Likewise, they provided an insight into the distribution of genetic variance within a European context which is mostly held within, rather than between, geographic regions.

ACKNOWLEDGMENTS
This research was partially supported by the Croatian Ministry of Science, Education and Sports (Project No. 073-0730718-0536 and 073-0730718-0598). AB, NG, and RH were supported by grant BB/I002561/1 from the UK Biotechnology and Biological Sciences Research Council.