Association mapping and haplotype analysis of the pre-harvest sprouting resistance locus Phs-A1 reveals a causal role of TaMKK3-A in global germplasm

Pre-harvest sprouting (PHS) is an important cause of quality loss in many cereal crops and is particularly prevalent and damaging in wheat. Resistance to PHS is therefore a valuable target trait in many breeding programmes. The Phs-A1 locus on wheat chromosome arm 4AL has been consistently shown to account for a significant proportion of natural variation in PHS resistance in diverse mapping populations. However, the deployment of sprouting resistance is confounded by the fact that different candidate genes, including the tandem duplicated Plasma Membrane 19 (PM19) genes and the mitogen-activated protein kinase kinase 3 (TaMKK3-A) gene, have been proposed to underlie Phs-A1. To further define the Phs-A1 locus, we constructed a physical map across this interval in hexaploid and tetraploid wheat. We established the close proximity of the proposed candidate genes, which are located within a 1.2 Mb interval. An association analysis of diverse germplasm used in previous genetic mapping studies suggests that TaMKK3-A, and not PM19, is the major gene underlying the Phs-A1 effect in European, North American, Australian and Asian germplasm. We identified the non-dormant TaMKK3-A allele at low frequencies within the A-genome diploid progenitor Triticum urartu genepool, and show an increase in the allele frequency in modern varieties. In UK varieties, the frequency of the dormant TaMKK3-A allele was significantly higher in bread-making quality varieties than in feed and biscuit-making cultivars. Analysis of exome capture data from 58 diverse hexaploid wheat accessions identified fourteen haplotypes across the extended Phs-A1 locus and four haplotypes for TaMKK3-A. Analysis of these haplotypes in a collection of UK and Australian cultivars revealed distinct major dormant and non-dormant Phs-A1 haplotypes in each country, which were either rare or absent in the opposing germplasm set.
The diagnostic markers and haplotype information reported in the study will help inform the choice of germplasm and breeding strategies for the deployment of Phs-A1 resistance into breeding germplasm.


Introduction
Pre-harvest sprouting (PHS) refers to the premature germination of physiologically mature grains while still on the ear, before harvest. PHS is primarily caused by insufficient levels, or rapid loss, of seed dormancy and is an important cause of quality loss in many cereal crops (Li et al., 2004; Fang and Chu, 2008). This is particularly relevant in wheat due to its detrimental effects on bread-making potential, which represents the most common use of wheat grains globally (Simsek et al., 2014). PHS is believed to be a modern phenomenon, as progenitor and wild wheat species [...] Selection for reduced seed dormancy during domestication and modern breeding programmes allowed for more uniform seed germination and rapid crop establishment (Nave et al., 2016). However, this also resulted in a higher level of susceptibility to PHS in modern wheat varieties (Barrero et al., 2010). In addition to its detrimental effect on quality, PHS also reduces yield and affects seed viability, making resistance to PHS a high priority in many breeding programmes.

Occurrence of PHS is heavily influenced by the environment. PHS is prevalent in wheat-growing regions with high levels of rainfall during the period of grain maturation and after-ripening. Increased ambient temperature during this period can further increase the susceptibility of grains to sprouting (Barnard and Smith, 2009; Mares and Mrva, 2014). This environmental dependency of PHS constitutes a constraint in selecting for PHS resistance in field conditions. In addition, resistance to PHS is highly quantitative and is controlled by numerous quantitative trait loci (QTL) [...]

Recently, two independent studies by Barrero et al. (2015) and Torada et al. (2016) identified the tandem duplicated Plasma Membrane 19 (PM19-A1 and PM19-A2) genes and a mitogen-activated protein kinase kinase 3 (TaMKK3-A) gene,

It is presently unclear whether the sprouting variation associated with Phs-A1 across diverse germplasm is due to allelic variation at PM19 or TaMKK3 [...]

In this study, we characterised the Phs-A1 physical interval in both hexaploid and tetraploid emmer wheat to establish the physical proximity of PM19 and TaMKK3-A. We developed markers for the candidate genes, and showed TaMKK3 [...]

[...] genes. Gene models that did not meet these criteria were considered as low confidence genes, and were not analysed further.

[...] HapMap VCF for SNP sites located within these contigs. We kept SNP sites with allele frequencies of >5% and accessions with >80% homozygous calls across SNPs.
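The two filtering thresholds stated above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: it assumes biallelic sites, reads "allele frequency" as minor allele frequency, applies the site filter before the accession filter, and uses hypothetical function names and a simple in-memory genotype table keyed by accession.

```python
from typing import Dict, List, Tuple


def _alleles(gt: str) -> List[str]:
    """Split a VCF GT string such as '0/1' or '1|1' into its allele indices."""
    return gt.replace("|", "/").split("/")


def is_homozygous(gt: str) -> bool:
    """True for a fully called genotype whose alleles are identical."""
    alleles = _alleles(gt)
    return "." not in alleles and len(set(alleles)) == 1


def minor_allele_frequency(gts: List[str]) -> float:
    """Minor allele frequency at one site (assumed biallelic), ignoring missing alleles."""
    called = [a for gt in gts for a in _alleles(gt) if a != "."]
    if not called:
        return 0.0
    alt_freq = sum(1 for a in called if a != "0") / len(called)
    return min(alt_freq, 1.0 - alt_freq)


def filter_genotypes(
    genotypes: Dict[str, List[str]],
    min_maf: float = 0.05,  # ">5%" allele frequency (assumed to mean MAF)
    min_hom: float = 0.80,  # ">80%" homozygous calls per accession
) -> Tuple[List[int], List[str]]:
    """Return (indices of kept SNP sites, names of kept accessions).

    `genotypes` maps accession name -> GT strings, one per site, same order.
    Filtering sites first and accessions second is an assumption.
    """
    accessions = list(genotypes)
    n_sites = len(genotypes[accessions[0]]) if accessions else 0
    kept_sites = [
        i for i in range(n_sites)
        if minor_allele_frequency([genotypes[a][i] for a in accessions]) > min_maf
    ]
    kept_accessions = [
        a for a in accessions
        if kept_sites
        and sum(is_homozygous(genotypes[a][i]) for i in kept_sites) / len(kept_sites) > min_hom
    ]
    return kept_sites, kept_accessions
```

In this sketch a monomorphic site is dropped by the frequency filter, and an accession with mostly heterozygous or missing calls at the remaining sites is dropped by the homozygosity filter.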

Allele information at the selected SNP loci was reconstructed for each line using the reference, alternate and genotype field information obtained from the VCF. Haplotype
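As an illustration of reconstructing allele calls from VCF fields, the sketch below maps a genotype (GT) string back to nucleotides using the REF and ALT columns. It relies only on the standard VCF convention that index 0 denotes REF and indices 1, 2, ... denote the comma-separated ALT alleles. The function name and the choice to report heterozygous calls as "A/G" and missing calls as None are assumptions made for this example, not details taken from the study.

```python
from typing import Optional


def decode_genotype(ref: str, alt: str, gt: str) -> Optional[str]:
    """Translate a VCF GT string into nucleotide alleles.

    ref: REF column, e.g. 'A'
    alt: ALT column, possibly comma-separated, e.g. 'G' or 'G,T'
    gt:  GT field, e.g. '0/0', '0|1', './.'

    Returns one allele for homozygous calls, 'X/Y' for heterozygous calls,
    and None when any allele is missing.
    """
    lookup = [ref] + alt.split(",")          # index 0 = REF, 1.. = ALT alleles
    indices = gt.replace("|", "/").split("/")
    if "." in indices:                        # missing call
        return None
    bases = [lookup[int(i)] for i in indices]
    if len(set(bases)) == 1:                  # homozygous: report the single allele
        return bases[0]
    return "/".join(bases)
```

For example, `decode_genotype("A", "G", "1/1")` gives the alternate allele for a line that is homozygous for it, while a "./." call is returned as None so that it can be excluded from haplotype construction.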

TaMKK3-A and PM19 are located within a 1.2 Mb physical interval.
We constructed an extended physical map across the Phs-A1 interval to investigate the [...] included four BAC clones (Figure 1A; Table S1).

Individual BACs were sequenced, assembled, repeat-masked and annotated for coding [...] genes was covered and estimated to be approximately 1.2 Mb (Figure 1).

TaMKK3-A is most closely associated with Phs-A1.
Torada [...] These results strongly support TaMKK3-A as the most likely causal gene for Phs-A1 across this highly-informative panel.

Origin and distribution of the TaMKK3 [...] which the non-dormant A allele was found at a 15% frequency (Table S2).

To determine if the TaMKK3 dormant allele was associated with improved end-use

Haplotype structure at the Phs-A1 interval in UK and Australian germplasm
To characterise a larger set of European (Gediflux) and Australian germplasm, we selected seven informative polymorphisms across seven genes from the HapMap dataset and developed KASP assays for these (Table S4). Using these seven assays, we defined 16 haplotypes in the European Gediflux collection (Table S5). This [...] By combining haplotype and pedigree information for these lines we could trace, to a [...] of relatedness amongst these lines relative to the entire collection (Table 2).
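Defining haplotypes from a fixed set of marker assays amounts to grouping lines by their combination of alleles across those markers. The sketch below shows one way this could be done. The line names and allele calls are hypothetical, the function name is invented for this example, and the rule of skipping any line with a missing call is an assumption rather than the procedure used in the study.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Haplotype = Tuple[str, ...]


def group_haplotypes(
    calls: Dict[str, Tuple[Optional[str], ...]]
) -> Dict[Haplotype, List[str]]:
    """Group lines that share the same alleles at every marker.

    `calls` maps line name -> one allele per marker, in a fixed marker order.
    Lines with any missing call (None) are skipped (assumption).
    """
    groups: Dict[Haplotype, List[str]] = defaultdict(list)
    for line, alleles in calls.items():
        if any(a is None for a in alleles):
            continue
        groups[tuple(alleles)].append(line)
    return dict(groups)


# Hypothetical calls at three markers (the study used seven KASP assays).
example_calls = {
    "line_1": ("A", "C", "G"),
    "line_2": ("A", "C", "G"),
    "line_3": ("G", "C", "T"),
    "line_4": ("A", None, "G"),  # incomplete, not assigned to a haplotype
}
example_groups = group_haplotypes(example_calls)
```

The number of distinct haplotypes is then `len(example_groups)`, and the size of each group gives the haplotype frequency within the collection.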

In support of this, the causal TaMKK3 [...]

Conflict of interest statement
The authors declare no conflict of interest.