Genomic Characterization of the Fruity Aroma Gene, FaFAD1, Reveals a Gene Dosage Effect on γ-Decalactone Production in Strawberry (Fragaria × ananassa)

Strawberries produce numerous volatile compounds that contribute to the unique flavors of fruits. Among the many volatiles, γ-decalactone (γ-D) has the greatest contribution to the characteristic fruity aroma in strawberry fruit. The presence or absence of γ-D is controlled by a single locus, FaFAD1. However, this locus has not yet been systematically characterized in the octoploid strawberry genome. It has also been reported that the volatile content greatly varies among the strawberry varieties possessing FaFAD1, suggesting that another genetic factor could be responsible for the different levels of γ-D in fruit. In this study, we explored the genomic structure of FaFAD1 and determined the allele dosage of FaFAD1 that regulates variations of γ-D production in cultivated octoploid strawberry. The genome-wide association studies confirmed the major locus FaFAD1 that regulates the γ-D production in cultivated strawberry. With the hybrid capture-based next-generation sequencing analysis, a major presence–absence variation of FaFAD1 was discovered among γ-D producers and non-producers. To explore the genomic structure of FaFAD1 in the octoploid strawberry, three bacterial artificial chromosome (BAC) libraries were developed. A deletion of 8,262 bp was consistently found in the FaFAD1 region of γ-D non-producing varieties. With the newly developed InDel-based codominant marker genotyping, along with γ-D metabolite profiling data, we revealed the impact of gene dosage effect for the production of γ-D in the octoploid strawberry varieties. Altogether, this study provides systematic information of the prominent role of FaFAD1 presence and absence polymorphism in producing γ-D and proposes that both alleles of FaFAD1 are required to produce the highest content of fruity aroma in strawberry fruit.


INTRODUCTION
Flavor and aroma are important characteristics of fruit quality in the cultivated octoploid strawberry (Fragaria × ananassa). Over 350 volatile compounds have been characterized in strawberry fruit, including alcohols, aldehydes, esters, ketones, lactones, and terpenes (Latrasse, 1991; Urrutia et al., 2017). The major aroma compounds are esters, lactones, furanones, sulfur compounds, and terpenoids (Perez et al., 1992; Zabetakis and Holden, 1997; Dong et al., 2013; Cruz-Rus et al., 2017). Among these flavor and aroma compounds, γ-decalactone (γ-D) has a desirable "fruity," "sweet," or "peachlike" aroma in strawberry fruit (Larsen et al., 1992; Jetti et al., 2007; Ulrich et al., 2007; Jouquand et al., 2008; Olbricht et al., 2008). Presence or absence of γ-D in strawberry fruit is controlled by a single gene, FaFAD1, encoding an omega-6 fatty acid desaturase. This gene was mapped to linkage group III (LG3) in the octoploid strawberry (Zorrilla-Fontanesi et al., 2012; Chambers et al., 2014; Sánchez-Sevilla et al., 2014). The abundance of γ-D is closely linked to FaFAD1 transcript levels, and a PCR-based marker has been developed that, in most cases, predicts the presence/absence of FaFAD1 and γ-D abundance (Cruz-Rus et al., 2017). However, the causal mutation that modulates γ-D abundance remains uncharacterized, which prevents the development of a functional marker. While it has been hypothesized that the lack of γ-D is due to a deletion of FaFAD1 (Chambers et al., 2014; Sánchez-Sevilla et al., 2014), the precise genomic context of the deletion has not been characterized. Therefore, additional sequencing and comparative analyses are required to characterize the genomic structure of the FaFAD1 region.
The concentration of γ-D varies widely among different γ-D-producing accessions (Chambers et al., 2014); however, the potential genetic and environmental causes for this variation have not been identified. Gene copy number variation (CNV) and gene dosage effects could be major sources of trait variation, with examples including muscat flavor in grapevine, aluminum tolerance in maize, and flowering time in wheat (Díaz et al., 2012; Maron et al., 2013; Emanuelli et al., 2014). In previous studies, several gene-specific sequence-tagged site (STS) and high-resolution melting (HRM) markers accurately predicted the presence and absence of FaFAD1 in strawberry cultivars (Sánchez-Sevilla et al., 2014; Noh et al., 2017). However, these dominant markers could not explain any possible gene dosage effects for variation in γ-D content. A functional codominant marker could help characterize gene dosage effects and provide increased efficiency in marker-assisted breeding.
The modern cultivated strawberry (Fragaria × ananassa) is an allo-octoploid (2n = 8x = 56) that arose from hybridization between a Chilean strawberry (F. chiloensis) and a North American native strawberry (F. virginiana) (Njunguna et al., 2013). The genome of the diploid progenitor species Fragaria vesca was sequenced (Shulaev et al., 2011) and has been used as a diploid reference genome for gene-trait association studies in F. × ananassa and for DNA marker development. However, the octoploid F. × ananassa genome is far more complicated than those of its diploid progenitors. In the last year, the chromosome-scale F. × ananassa 'Camarosa' reference genome was developed, and F. vesca was shown to be the dominant diploid progenitor in terms of gene content, expression abundance, and genetic control for metabolic and disease resistance traits (Edger et al., 2019). Two other diploid progenitors, F. iinumae and F. nipponica, have recently been fully sequenced. These new reference genomes will serve as powerful genetic resources to unravel the complexity of the octoploid cultivated strawberry genome for gene-trait association studies.
Unlike diploid genomes, polyploid genomes have complex subgenomes that originated from the same or different diploid ancestors (Wendel, 1989, 2000; Levy and Feldman, 2002; Wendel and Cronn, 2003; Aharoni et al., 2004; Edger et al., 2019). Thus, polyploid subgenomes will likely show variation in copy numbers at homoeologous loci. Duplicated homologs are often observed in different subgenomes, which makes it difficult to isolate subgenome-specific sequences or gene variations related to phenotypic trait variation. In particular, when certain genomic information is missing from a reference genome, bacterial artificial chromosome (BAC) libraries carrying large-insert genomic DNA can be an effective tool for map-based gene cloning and characterization of target genomic regions in polyploids (Klymiuk et al., 2018; Liu et al., 2018). Because of the complexity of the octoploid strawberry genome, the recent construction of high-quality subgenome-specific reference sequences significantly facilitates identification of quantitative trait loci (QTL) and development of DNA markers tightly linked to agronomically important traits. The current octoploid reference genome (cv. Camarosa) is not a phased genome assembly, and thus it does not contain complete genome information for each haplotype. Another octoploid reference sequence is available from the Japanese cultivar 'Reikou', which is a haplotype-resolved assembly, but this draft genome sequence has not been fully characterized. Because of the high levels of genetic variation present in different breeding germplasm, it is not possible to explain all the genetic complexity of the octoploid cultivated strawberry with only two reference genomes. In addition, the two octoploid strawberry reference genomes do not contain the FaFAD1 region, and this hinders dissection of the functional sequence variations linked to γ-D content among cultivated strawberry varieties.
Because of these limitations, chromosome specific BAC libraries can be especially valuable in polyploid varieties as they avoid the problem of homoeology (Wang et al., 2007), and they further identify genomic regions of target traits that are absent in the reference genome sequence (Gordon et al., 1999).
This report presents a comprehensive characterization of the genomic context of the FaFAD1 deletion, along with functional data showing that the gene is both necessary and sufficient for γ-D production. A dosage effect has been observed between the allelic states, allowing the development of co-dominant DNA markers that can be used to assess a genotype's ability to produce this important flavor volatile.

Physical Mapping of the Major Locus FaFAD1 That Regulates the Production of γ-D in the Octoploid Strawberry
To characterize the genomic region of FaFAD1 in the octoploid cultivated strawberry, we used a multiscale genomic approach, as shown in Figure 1. The locus controlling γ-D was previously identified on linkage group LGIII-2 (Sánchez-Sevilla et al., 2014). RNA-sequencing analysis identified a candidate gene, FATTY ACID DESATURASE GENE 1 (FaFAD1), responsible for the presence/absence of γ-D in octoploid strawberries (Chambers et al., 2014). To determine the physical location of the FaFAD1 locus in the octoploid strawberry genome, whole-genome SNP genotyping was conducted with the IStraw35 array, and a genome-wide association study (GWAS) identified a single peak for γ-D biosynthesis (h² = 0.536) on chromosome group 3 (Figure 2A). To identify the subgenome harboring FaFAD1, 11 significant SNP probes were aligned to the four subgenomes of chromosome group 3 of the 'Camarosa' reference genome (Table 1). All 11 probes were located between 25.2 and 31.3 Mb on subgenome 3-2. Single-marker analysis with AX-166512929, which is closely associated with the FaFAD1 locus, suggests that this QTL is responsible for the natural variation of γ-D in octoploid strawberry accessions (Figure 2B). There was also significant variation in γ-D content between homozygous (AA) and heterozygous (AB) accessions (Figure 2B).
Agroinfiltration experiments were used to test the requirement of FaFAD1 for γ-D production. Developing fruits of a γ-D-producing genotype were treated with Agrobacterium cultures containing an RNAi construct targeting FaFAD1. Decreased accumulation of FaFAD1 transcripts was associated with a significant reduction in γ-D (Figure 2C). These results provide substantial evidence that the FaFAD1 gene is a key regulator of γ-D production in the octoploid strawberry.

A Large Deletion Was the Only Genomic Variation Linked to γ-D Production
The published allele for FaFAD1 (Sánchez-Sevilla et al., 2014) was not detected in the 'Camarosa' reference genome, which is consistent with this cultivar not producing γ-D. Thus, it is not possible to examine the genomic structure of FaFAD1 using the current reference genome for the octoploid strawberry. Using hybridization capture-based target enrichment sequencing, we examined a 100 kb region surrounding FaFAD1 for genomic variants that could be linked to γ-D content. Hybridization probes for the genomic region of FaFAD1 were designed using the F. vesca genome sequence. A total of 57,308,440 paired-end reads were generated for the 100 kb target region, and de novo assembly was performed on 15 individuals (11 γ-D producers and four γ-D non-producers). A number of SNPs were found in the 100 kb region containing FaFAD1, but none correlated with the production of γ-D.
The large deletion of FaFAD1 occurred only in γ-D non-producers, including the genotypes 'Winter Dawn', 'Mara des Bois', and 'Strawberry Festival' (Figure 3). No CNVs associated with the FaFAD1 gene were detected in the FaFAD1 genomic region of any of the 11 γ-D producers. Because Illumina short reads (150 bp paired-end) were used in this analysis, it was not possible to differentiate subgenome-specific sequence variations or determine the exact size of the insertion/deletion.

The Deletion of FaFAD1 Dictates γ-D Variation in the Octoploid Cultivated Strawberry
To explore the genomic region of FaFAD1 in γ-D-producing accessions, three BAC libraries developed from University of Florida strawberry breeding accessions were screened with two gene-based markers, UFGDHRM5 (Noh et al., 2017) and qFaFAD1 (Sánchez-Sevilla et al., 2014) (Supplementary Tables 1, 3). Because neither 'Camarosa' nor 'Reikou' contains FaFAD1, the genomic structure could not be characterized from the reference sequences.
In the super pool (SP) screening using agarose gel electrophoresis with the UFGDHRM5 and qFaFAD1 markers, 2-4 positive clones were identified from each of the BAC libraries, and these were carried forward to matrix pool (MP) screening for all libraries (Supplementary Figures 1, 2). MP screening identified a total of six positive clones from the two BAC libraries of γ-D producers, FL 11.77-96 and 'Florida Brilliance', and these were processed for sequencing. Illumina sequencing of the six BAC clones yielded approximately 7.7 million reads (average length 149 bp), and de novo assembly was performed (Supplementary Figure 3 and Supplementary Table 2). Each of the six BAC clones generated 3-9 contigs with an average length of 135 kb. To determine sequence variations in the genomic region of FaFAD1 among γ-D producers and γ-D non-producers, the contig sequences from all six BAC clones were aligned to the two octoploid reference sequences (cv. Camarosa and Reikou). A deletion of 8,262 bp, including 2,323 bp for FaFAD1, was consistently found in the γ-D non-producers 'Camarosa' and 'Reikou' (Figure 4A and Supplementary Figure 4).
To confirm the difference in the genomic structure of the 8,262 bp indel region between γ-D producers and non-producers, whole genome sequencing reads of two breeding accessions were mapped to the clone BAC_GD004 from library BB1 ('Florida Brilliance') (Figure 4B). Illumina sequence reads mapped evenly across the 8,262 bp insertion region in the γ-D producer, FL 10-92, while reads associated with FaFAD1 were missing in the γ-D non-producer, FL 13.55-195. Some short reads mapped to the flanking region of FaFAD1 in the non-producer because these reads originate from other homoeologous subgenomes of chromosome 3. The reads mapped to the promoter and coding region of FaFAD1 are from the homologous gene located on chromosome 6. This homologous gene is not functionally associated with the production of γ-D. Furthermore, to examine whether the 8,262 bp presence/absence polymorphism is consistently present in other cultivated strawberry accessions, long-range PCR was conducted with five γ-D producers and five γ-D non-producers. All five γ-D producers tested had a PCR product of the expected size (9,209 bp), while a 947 bp product was amplified in the γ-D non-producers due to the absence of the 8,262 bp region (Figure 4C). Taken together, our results strongly indicate that the presence/absence variation of the FaFAD1 region is directly related to the production of γ-D in the octoploid strawberry.

FIGURE 3 | Hybrid-probe captured sequencing for the 100-kb flanking region of FaFAD1 and comparative analysis among γ-D producers and γ-D non-producers. Illumina short reads (150 bp PE) were assembled to the FvFAD1 region from the F. vesca reference genome (Edger et al., 2018). Each line represents a 100-kb flanking region of FaFAD1 for 15 genotypes. The dark black lines represent deleted genomic sequences in the FaFAD1 region.

Codominant Marker Reveals Gene Dosage Effects on γ-D Production
It has been demonstrated that the content of γ-D varies among producers (Chambers et al., 2014). The amount of γ-D in 48 accessions, 38 with FaFAD1 and 10 without FaFAD1, was measured using gas chromatography/mass spectrometry (GC-MS) (Table 2). The content of γ-D varies widely among the 38 producers, while γ-D is not detectable in γ-D non-producers. To examine  Table 3). HRM and KASP markers developed from the flanking region of FaFAD1 were used to amplify a common genomic sequence in both γ-D producers and non-producers to examine the FaFAD1 gene dosage effects (Supplementary Figure 4). All tested markers, including a dominant marker (UFGDHRM5), detected the presence and absence of FaFAD1 in the accessions (Figure 5 and Table 2). It was of interest to test whether variation was associated with dosage imparted by the homo- or heterozygous state of the FaFAD1 gene. The NGD001 and UFGDKASP markers distinguished between heterozygous (Aa) and homozygous (AA) genotypes among γ-D producers (Figure 5 and Table 2). The relative abundance of γ-D was significantly higher in homozygous than in heterozygous producers (Figure 6A). Steady-state transcript levels of FaFAD1 in the homozygous (AA) and heterozygous (Aa) γ-D producers were measured via qRT-PCR (Figure 6B). We found that transcript levels of the homozygous "AA" genotypes were significantly higher than those of the heterozygous "Aa" genotypes. The differences in transcript abundance match well with metabolite accumulation, indicating that allele dosage is the cause of the variation in γ-D content in the octoploid cultivated strawberry.

FIGURE 4 | [...] to BAC_GD004. (C) Long-range PCR results detect an 8,262 bp deletion between five γ-D producers and five γ-D non-producers. All PCR amplicons were similar to the expected size of 9,208 bp for γ-D producers and 947 bp for γ-D non-producers.
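The dosage comparison described in this section (homozygous AA versus heterozygous Aa producers, with aa non-producers as the zero-dose class) can be sketched as a simple grouping of metabolite values by allele dose. This is a minimal illustration only: the function name `mean_by_dose` is hypothetical, and the numbers in the usage example are made-up placeholders, not values from Table 2.

```python
from statistics import mean

# Allele dose = number of FaFAD1-containing alleles at the locus.
DOSE = {"AA": 2, "Aa": 1, "aa": 0}


def mean_by_dose(records):
    """Group (genotype, abundance) pairs by allele dose and average each group.

    `records` is an iterable of (genotype, relative_abundance) tuples, where
    genotype is one of "AA", "Aa", or "aa" as called by a codominant marker.
    """
    groups = {}
    for genotype, abundance in records:
        groups.setdefault(DOSE[genotype], []).append(abundance)
    return {dose: mean(values) for dose, values in sorted(groups.items())}


# Placeholder values for illustration only (NOT measurements from the study).
toy_records = [("AA", 10.0), ("AA", 14.0), ("Aa", 5.0), ("Aa", 7.0), ("aa", 0.0)]
dose_means = mean_by_dose(toy_records)
```

A dominant marker would collapse the AA and Aa classes into a single "present" group, which is why a codominant marker is needed before a comparison of this kind can be made.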

DISCUSSION
Strawberry fruit has one of the most complex aromas, and it has been reported that differences in aroma volatiles can improve strawberry fruit quality (Pelayo-Zaldivar et al., 2005; Dong et al., 2013). Several studies have described the importance of key compounds in aroma production and the genetic loci or genes that control fruity aroma in peach (Eduardo et al., 2013; Pirona et al., 2013), apple (Dunemann et al., 2009; Costa et al., 2013), and octoploid strawberry (Aharoni et al., 2004; Zorrilla-Fontanesi et al., 2012). One of the key compounds, γ-decalactone, has a distinct peach aroma, which increases the perception of sweetness in both peach (Zhang et al., 2010; Zhu and Xiao, 2019) and strawberry fruits (Schwieterman et al., 2014; Ulrich and Olbricht, 2016). While strawberries have a unique combination of sugars, acids, and volatile organic compounds (VOCs), some favorable flavors have been neglected due to the limited resources of strawberry breeding programs (Folta and Klee, 2016; Ulrich and Olbricht, 2016). In recent years, sensory qualities have become popular targets for genetic improvement, including qualities such as sweetness and unique flavors (Latrasse, 1991; Cruz-Rus et al., 2017). FaFAD1 is a priority for these reasons, as it enhances the fruity flavor and sweetness of strawberries. This report and others have shown that the presence/absence and nature of the causal biosynthetic gene make γ-D an attractive target for marker-assisted breeding for enhanced flavor and aroma.
GWAS identified a major locus, FaFAD1, controlling the production of the γ-D volatile on octoploid chromosome 3-2 (Table 1). Although the content of γ-D is controlled by one major locus (the candidate gene FaFAD1), the accumulation of γ-D varies widely (Larsen et al., 1992; Jetti et al., 2007; Zorrilla-Fontanesi et al., 2012; Chambers et al., 2014; Sánchez-Sevilla et al., 2014). These reports indicate that the physical presence/absence of the FaFAD1 gene correlates with γ-D accumulation. In this report, the FaFAD1 transcript was silenced using agroinfiltration and RNAi in a γ-D-producing cultivar, demonstrating functionally that the gene is both necessary and sufficient for γ-D accumulation and that the volatile is not controlled by another element inside the deleted region.
To sequence the genomic region containing the functional FaFAD1 allele, probe capture sequencing of the FaFAD1 flanking region (100 kb) was performed on 11 γ-D producers and four γ-D non-producers. The multiple sequence alignment of the 15 genotypes clearly showed the absence of FaFAD1 in γ-D non-producers (Figure 3). Our findings indicate that a single copy of FaFAD1 located on chromosome 3 is responsible for production of γ-D.
Among polyploid genomes in general, high sequence similarity between homoeologous chromosomes is often a challenge in differentiating subgenomes (Mandáková et al., 2014). The subgenomes of polyploid plants are generally large and contain extensive repeats, which can greatly impede genome assembly resulting in non-contiguous and incorrect assemblies (Saski et al., 2017). Therefore, it remains challenging to assemble short-read sequences to specific subgenomes (Birchler and Veitia, 2012;Madlung and Wendel, 2013;Mandáková et al., 2014).
To overcome these difficulties, BAC-based physical maps combined with high-density genetic maps were used to improve the accuracy of whole-genome sequence assemblies (Luo et al., 2010; Ariyadasa and Stein, 2012; Sierro et al., 2013). Whole-genome sequencing and BAC libraries are physical and lasting genomic resources that have critical value as tools for positional cloning of genes and associated regulatory sequences (Saski et al., 2017). Here, the causal FaFAD1 deletion region in the commercial octoploid strawberry was successfully characterized by using the BAC library sequencing approach (Figure 4).
In this study, an approximately 100-kb region flanking the FaFAD1 alleles was obtained from the six BAC clones. When these BAC clones were compared to the two reference genomes, cv. Camarosa and the Japanese cv. Reikou, which are γ-D non-producers, it was confirmed that there is an 8,262 bp deletion in subgenome 3-2 of 'Camarosa' and Ch3Bib of 'Reikou' (Figure 4A). Moreover, long-range PCR using γ-D producers and γ-D non-producers confirmed that the non-producers share the same 8,262 bp deletion (Figure 4C). While these results are confined to University of Florida cultivars, it is likely that the same deletion will be a useful diagnostic tool across strawberry varieties. The two newly developed codominant markers, NGD001 and UFGDKASP, are much improved over the previous dominant marker (Noh et al., 2017) in terms of accuracy and throughput, and they permit improved parental selection and rapid screening of progeny for homozygous plants likely to be higher γ-D producers. It would be valuable to further examine the deletion in diverse strawberry accessions from different breeding programs.

Osborn et al. (2003) reported that allele dosage is expected to relate to different levels of gene expression that affect target traits. Thus, it is highly possible that polyploidy could increase the potential variation in gene expression, which is reflected in phenotypic variation. Accounting for this increased variation in dosage-regulated gene expression has become important for genetic studies in polyploid species (Garcia et al., 2013; Hackett et al., 2013; Endelman et al., 2018). To determine allele dosage effects in γ-D-producing accessions, the content of the γ-D volatile was measured by GC-MS in 48 advanced selections. The genotype results from the two codominant markers were strongly related to γ-D abundance among the 38 γ-D producers (Table 2). In addition, the expression level of FaFAD1 was related to the zygosity of the gene (Figure 6). This result indicates that there is a dosage effect on γ-D production.
In summary, the precise location of FaFAD1 was confirmed on chromosome 3-2 of the octoploid strawberry genome. By utilizing hybridization-based targeted enrichment sequencing and the BAC libraries, a major presence/absence polymorphism (8,262 bp) was found within the FaFAD1 locus, which is associated with γ-D production in the octoploid strawberry. This FaFAD1-containing deletion was present only in γ-D non-producers and is directly responsible for the lack of fruity aroma. The newly developed codominant markers for FaFAD1 from this study revealed an allele dosage effect on the content of γ-D. This provides new evidence of an allele dosage effect on volatile synthesis, suggesting that altering allele number can be a potential tactic for genetic improvement of fruit flavor. Our results provide a directly translatable resource for strawberry breeders and research communities, which will further facilitate the development of new strawberry cultivars with improved flavor.

Genome-Wide Association Study
For the GWAS of γ-decalactone, we used three crossing populations (n = 59) derived from the crosses 'Florida Elyana' × 'Mara des Bois' (population 10.113), 'Mara des Bois' × 'Florida Radiance' (population 13.75), and 'Strawberry Festival' × 'Winter Dawn' (population 13.76), published in Barbey et al. (2020). GWAS for γ-decalactone was conducted via a mixed linear model approach using the GAPIT v2 package (Tang et al., 2016) in R software version 3.3.1. GWAS associations were considered significant at an alpha of 0.05 after correction for the false discovery rate (FDR). SNP markers from the Affymetrix IStraw35 Axiom Array (Bassil et al., 2015; Verma et al., 2016) were mapped to the diploid F. vesca physical map, as available genetic maps in octoploid strawberry do not include a majority of the IStraw35 markers used in this study.
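The significance rule above (alpha of 0.05 after FDR correction) can be illustrated with a short sketch. The text does not name the FDR procedure, so the Benjamini-Hochberg step-up method is assumed here; the function name `benjamini_hochberg` and the p-values in the usage example are illustrative and are not taken from the study.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per p-value: True if significant under BH FDR control.

    ASSUMPTION: Benjamini-Hochberg is used as the FDR procedure; the source
    text states only that associations were corrected for FDR.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up rule: find the largest rank k with p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    # Every test ranked at or below max_rank is declared significant.
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            significant[idx] = True
    return significant


# Illustrative p-values only (not SNP results from this study).
example = benjamini_hochberg(
    [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
)
```

Because the rule is step-up, a p-value above its own rank threshold can still be declared significant when a larger p-value further down the sorted list passes its threshold.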

Transient Assay
Transient expression in strawberry fruits by agroinfiltration was performed according to Hoffmann et al. (2006). Agrobacterium tumefaciens strain EH105 containing the RNAi vector was used to perform transient expression analysis and to test its effect on γ-D abundance in strawberry fruits. The culture was grown at 28 °C overnight, and then the bacterial culture was resuspended in infection buffer (10 mM MgCl₂, 100 µM acetosyringone, and 10 mM MES, pH 6.0) and shaken for 4 h at room temperature before infiltration of fruits. After the infection, the fruits were collected for gene expression analysis and measurement of γ-D. For quantitative reverse transcription PCR (qRT-PCR), total RNA from the fruits was isolated as described by Pillet et al. (2017), and 1 µg of total RNA was used to synthesize the cDNA. For the qRT-PCR assay, a transcript corresponding to a conserved hypothetical protein, FaCHP1 (Clancy et al., 2013), was used as a constitutive reference. The qRT-PCR was run on an Applied Biosystems StepOnePlus Real-Time PCR System using StepOne Software (v2.0) (Applied Biosystems, Foster City, CA, United States). The qRT-PCR data were analyzed using the comparative CT method (ΔΔCT) following the manufacturer's instructions. The gene expression of FaFAD1 was measured with three technical replicates and repeated for at least three biological replicates.
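As a worked illustration of the comparative CT calculation, the sketch below computes relative expression as 2^(-ΔΔCT), with the target gene normalized to a reference gene (FaCHP1 in this study) and then to a control sample. It assumes 100% amplification efficiency for both amplicons; the function name and the CT values in the usage example are hypothetical and not taken from the study.

```python
from statistics import mean


def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Relative expression by the comparative CT method, 2 ** (-ddCT).

    Each argument is a list of technical-replicate CT values.
    ASSUMPTION: both amplicons amplify with ~100% efficiency.
    """
    # dCT: target CT minus reference CT within each sample.
    d_ct_sample = mean(ct_target_sample) - mean(ct_ref_sample)
    d_ct_control = mean(ct_target_control) - mean(ct_ref_control)
    # ddCT: treated sample relative to the control sample.
    dd_ct = d_ct_sample - d_ct_control
    return 2.0 ** (-dd_ct)


# Hypothetical CT values: the target crosses threshold two cycles later
# (relative to the reference) in the RNAi-treated fruit than in the control.
fold = relative_expression(
    ct_target_sample=[26.0, 26.0, 26.0],
    ct_ref_sample=[20.0, 20.0, 20.0],
    ct_target_control=[24.0, 24.0, 24.0],
    ct_ref_control=[20.0, 20.0, 20.0],
)
```

In this toy case ΔΔCT is 2, so the target transcript in the treated sample is at one quarter of the control level.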

Hybridization-Based Targeted Enrichment Sequencing and Data Analysis
Targeted enrichment of the FaFAD1 genomic region was performed by hybrid capture-based next-generation sequencing (NGS) using a synthetic library consisting of a final set of 2,000 Axiom® IStraw35 384HT array probes (Cronn et al., 2012; Verma et al., 2016). The capture oligonucleotides were 150 nt long and were designed to target 100 kb of FvFAD1 (Fvb3: 31,039,796-31,149,795, F. vesca genome v2.0 a2). Probes were designed to cover the entire 100 kb genomic region of FaFAD1 with 3× coverage depth. The targeted capture sequencing libraries were constructed using the Nextera Rapid Capture Custom Enrichment Kit. The captured sequence was quality controlled using an Agilent 2100 Bioanalyzer. All sequencing was performed on the Illumina HiSeq 2000 using paired-end 100-bp reads following standard manufacturer protocols. Fifteen accessions were used for this experiment: 11 γ-D producers, including 'Albion', 'Benicia', 'Florida Elyana', 'Florida Radiance', 'Sweet Charlie', Sweet Sensation® 'Florida 127', and 'Winterstar', and four γ-D non-producers, including 'Mara des Bois', 'Strawberry Festival', and 'Winter Dawn'. Raw FASTQ files were first checked using the FastQC tool. Raw short reads from each capture library were mapped to a 100-kb region of FvFAD1 downloaded from the Genome Database for Rosaceae (GDR) with CLC Genomics Workbench 11.0, using the following parameters: match score = 1, mismatch cost = 2, cost of insertions and deletions = linear gap cost, insertion cost = 3, deletion cost = 3, length fraction = 0.8, and similarity fraction = 0.9. Consensus sequences were generated from all runs, giving a total of 15 accession sequences, which were exported in FASTA format. The consensus sequences were imported into Geneious 11.0.5, and multiple sequence alignment (MSA) was performed using the Geneious alignment algorithm.

BAC Library Construction and Screening
Three BAC libraries were prepared from the etiolated leaf tissues of three strawberry accessions, consisting of two γ-D producers (FL 11.77-96 and 'Florida Brilliance') and one γ-D non-producer (Supplementary Table 1). For tissue etiolation, strawberry plants were covered with black plastic bags and kept in a greenhouse for 2 weeks. The etiolated young white leaf tissues were collected for DNA extraction. The preparation of high-molecular-weight DNA and library construction was performed at Amplicon Express Inc. (Pullman, Washington). The BamHI- or HindIII-digested genomic fragments were cloned into a BAC library vector. The recombinant vector was used to transform DH10B Escherichia coli competent cells (Invitrogen, CA, United States). The library was stored in 384-well plates filled with 50 µl freezing LB Lennox medium [36 mM K₂HPO₄, 1.7 mM sodium citrate, 0.4 mM MgSO₄, 6.8 mM (NH₄)₂SO₄, 4.4% glycerol, and 12.5 µg/ml chloramphenicol]. Plates were incubated for 18 h, replicated, and kept at −80 °C for long-term storage. The two libraries (HindIII and BamHI) for each accession comprise a total of 32,000 clones, which represent approximately 5× octoploid strawberry genome coverage. Each library set comprised 16 super pools (SPs), with individual SPs representing BAC DNA from 10,000 clones for each enzyme. Each SP was divided into 23 MPs, consisting of three plate pools (PPs), eight row pools (RPs), and 12 column pools (CPs). Each library contains approximately 30,000 clones with an average insert size of 150 kb. Together, the three BAC libraries provide about 15× genome coverage of the octoploid strawberry.
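The coverage figures quoted above follow from simple arithmetic, sketched below. The genome size is an assumption that does not appear in this text (roughly 0.81 Gb is used as a round figure for the octoploid strawberry genome), and the probability estimate uses the standard Clarke-Carbon approximation, which the authors do not state they applied. Function names are illustrative.

```python
def library_coverage(n_clones, mean_insert_bp, genome_bp):
    """Fold coverage of a clone library: total cloned bases / genome size."""
    return n_clones * mean_insert_bp / genome_bp


def probability_locus_present(n_clones, mean_insert_bp, genome_bp):
    """Clarke-Carbon estimate that a given locus is in at least one clone."""
    fraction = mean_insert_bp / genome_bp
    return 1.0 - (1.0 - fraction) ** n_clones


# ASSUMPTION: ~0.81 Gb genome size; not stated in the text above.
GENOME_BP = 810_000_000

# Values from the text: ~30,000 clones per library, ~150 kb mean insert.
per_library = library_coverage(30_000, 150_000, GENOME_BP)
all_three = 3 * per_library
p_hit = probability_locus_present(30_000, 150_000, GENOME_BP)
```

Under these assumptions one library gives roughly 5.6× coverage and three give roughly 17×, in line with the approximate 5× and 15× figures in the text; the exact values depend on the genome size assumed.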
To identify BAC clones containing FaFAD1, two gene-specific markers, UFGDHRM5 (Noh et al., 2017) and qFaFAD1 (Sánchez-Sevilla et al., 2014), were tested on all three libraries (Supplementary Table 3). PCR reactions were performed with 16 SPs and 23 MPs with controls, following the online tool from Amplicon Express. All PCR amplifications were performed in 5-µl reactions containing 2× AccuStart™ II PCR ToughMix® (Quantabio, MA, United States), 1× LC Green® Plus melting dye (BioFire, UT, United States), 0.5 µM of each HRM primer, and 0.5 µl of DNA. The reaction conditions were as follows: 95 °C for 3 min, followed by 45 cycles of 95 °C for 10 sec, 62 °C for 10 sec, and 72 °C for 20 sec.
The plasmid DNA from the BAC clones was isolated using the QIAGEN plasmid midi kit (QIAGEN, Hilden, Germany). For NGS, each DNA sample was sequenced on an Illumina high-throughput sequencer with a 150-bp paired-end (PE) sequencing strategy. The sequence reads obtained from Illumina were assembled using the de novo assembly program from CLC Genomics Workbench, version 11.0.

Development and Validation of Functional Codominant Marker
The 48 accessions were used for marker development and validation. DNA extraction was performed using the simplified CTAB method with modifications (Noh et al., 2017). BAC-containing sequences from a γ-D-producing accession were compared to the octoploid reference genome 'Camarosa', which does not produce γ-D. Primers were designed from the polymorphic sequences using IDT's PrimerQuest Software (San Jose, CA, United States). PCR amplifications were performed in a 5-µl reaction containing 2× AccuStart™ II PCR ToughMix® (Quantabio, MA, United States), 1× LC Green® Plus melting dye (BioFire, UT, United States), 0.5 µM of each HRM primer, and 1 µl of DNA. The PCR and HRM analysis were performed in a LightCycler® 480 system II (Roche Life Science, Germany) using a program consisting of an initial denaturation at 95 °C for 5 min, followed by 45 cycles of denaturation at 95 °C for 10 sec, annealing at 62 °C for 10 sec, and extension at 72 °C for 20 sec. After PCR amplification, the samples were heated to 95 °C for 1 min and cooled to 40 °C for 1 min. Melting curves were obtained by melting over the desired range (60-95 °C) at a rate of 50 acquisitions per 1 °C. Melting data were analyzed using the Melt Curve Genotyping and Gene Scanning Software (Roche Life Science, Germany). Analysis of HRM variants was based on differences in the shape of the melting curves and in Tm values.
KASP was designed using the 3CR Bioscience website 9 , and the genotyping assay was performed using a LightCycler R 480 system II (Roche Life Science, Germany). KASP assays were conducted in 5 µl reactions with 50 ng of DNA template, 2 × PACE TM Genotyping Master Mix (3CR Bioscience, United Kingdom), and PACE TM Assay mix (UFGDKASP-allele-1-FAM, UFGDKASP-allele-2-HEX, and one reverse primer UFGDKASP-common) in a LightCycler R 480 Multiwell 384well plate. PCR optimization was carried out as follows: initial denaturation at 94 • C for 15 min, a touch-down step followed by 10 cycles of 94 • C for 20 sec, and 61 • C decreasing 0.6 • C per cycle to a final annealing/extension temperature of 55 • C for 1 min, followed by 33 cycles of 94 • C for 20 sec, annealing/extension temperature of 55 • C for 1 min, with a final genotyping stage of 37 • C for 1 min, a suitable temperature for KASP detection, and finally a plate reading at 37 • C for 1 sec. KASP genotyping data were analyzed using Endpoint Genotyping Software (Roche Life Science, Germany).
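The touch-down step described above can be written out cycle by cycle. The following is a minimal sketch, assuming that the first touch-down cycle runs at 61°C and that each later cycle is 0.6°C cooler; the function name and its parameters are hypothetical and are not part of the instrument software.

```python
def touchdown_schedule(start_c=61.0, step_c=0.6, touchdown_cycles=10,
                       final_c=55.0, standard_cycles=33):
    """Return the annealing/extension temperature (deg C) for every cycle."""
    temps = []
    for cycle in range(touchdown_cycles):
        # Assumption: cycle 0 runs at start_c; never go below final_c.
        temps.append(round(max(start_c - step_c * cycle, final_c), 1))
    # Remaining cycles are held at the final annealing/extension temperature.
    temps.extend([final_c] * standard_cycles)
    return temps


schedule = touchdown_schedule()
print(len(schedule))   # total number of amplification cycles
print(schedule[:10])   # temperatures during the touch-down phase
```

Under this reading, ten decrements of 0.6°C account for the full 6°C difference between 61°C and 55°C, so the first standard cycle is the first one run at 55°C.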

Long-Range PCR for the FaFAD1 Region
Tissue samples were collected from five homozygous genotype "AA" accessions  and five "aa" accessions ('Camarosa', 'Florida Festival', 'Mara des Bois', 'Treasure', and 'Winter Dawn'). The DNA was extracted following the simplified CTAB method with modifications (Noh et al., 2017). Primers were designed to yield a PCR amplicon of 9,208 bp in γ-D producers and 947 bp in γ-D non-producers. PCR amplifications were performed in 10-µl reactions containing AccuStart Long Range SuperMix (Quantabio, MA, United States), 0.5 µM of each HRM primer set, and 2 µl of DNA. PCR was performed in a ProFlex PCR system (Applied Biosystems, United States) using a program consisting of an initial denaturation at 95°C for 3 min, followed by 35 cycles of denaturation at 92°C for 30 sec and annealing at 65°C for 10 min, and a final extension at 72°C for 10 min. PCR products were separated by agarose gel electrophoresis (1.0%) in TAE buffer (Tris-acetate-EDTA buffer, pH 8), stained with SYBR® Safe DNA gel stain (Invitrogen, Perth, WA), and visualized with a UVP GelStudio Plus touch imaging system (Analytik Jena AG, Germany).

Volatiles Analysis
Fruit from 48 strawberry accessions was collected in the field at the Gulf Coast Research and Education Center (GCREC) in Wimauma, Florida, during the 2017/2018 season. Harvest dates were December 20, 2017, January 10, 2018, and January 29, 2018. The experiment was done with three biological replicates, each consisting of at least 6-8 fruits per genotype. Fruit of all 48 accessions was analyzed for the volatile γ-D using GC-MS in the 2018 season. Fruit processing for volatile analysis was conducted as follows: 4-6 fully ripe, clean, and normal-shaped berries were harvested from each genotype, the calyx was removed from each berry, and approximately 25 g of flesh was collected. Each sample was frozen in liquid nitrogen and stored at −80°C until processing. Each sample was blended with an equal weight of saturated 35% NaCl solution. The volatile 3-hexanone was added as an internal standard to a final concentration of 1 ppm prior to blending. A 5-ml aliquot of each sample was dispensed into a 20-ml glass vial and sealed with a magnetic crimp cap (Gerstel, Baltimore, MD, United States). Samples were frozen at −20°C until analysis by GC-MS.
For GC-MS, a 2-cm tri-phase SPME fiber (50/30 µm DVB/Carboxen/PDMS, Supelco, Bellefonte, PA, United States) was used to collect and concentrate volatiles prior to running on an Agilent 6890 GC coupled with a 5973 N MS detector (Agilent Technologies, Palo Alto, CA, United States). Before analysis, samples were held at 4°C in a Peltier cooling tray attached to an MPS2 autosampler (Gerstel). An authentic γ-D standard (Sigma-Aldrich, St. Louis, MO, United States) was run under the same chromatographic conditions as berry samples to verify volatile identification. The area of each γ-D peak was normalized to the peak area of the internal standard, and normalized peak areas were compared between samples.
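The internal-standard normalization described above amounts to a simple ratio of peak areas. The sketch below illustrates it under stated assumptions: the function name, the sample labels, and all peak-area values are hypothetical and are not data from this study.

```python
def normalize_peak(analyte_area, internal_standard_area):
    """Return the analyte peak area relative to the internal standard."""
    if internal_standard_area <= 0:
        raise ValueError("internal standard peak area must be positive")
    return analyte_area / internal_standard_area


# Hypothetical (gamma-D area, 3-hexanone area) pairs, for illustration only.
samples = {
    "sample_1": (500.0, 1000.0),
    "sample_2": (120.0, 960.0),
    "sample_3": (0.0, 1010.0),   # no detectable gamma-D peak
}

for name, (gd_area, istd_area) in samples.items():
    print(name, normalize_peak(gd_area, istd_area))
```

Because the same amount of 3-hexanone is added to every sample, dividing by its peak area compensates for run-to-run differences in extraction and injection, so the resulting ratios can be compared between samples.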

Quantitative Real-Time PCR Analysis
For total RNA extraction, five homozygous "AA" genotypes  and five heterozygous "Aa" genotypes  were collected from a FL 18.50 advanced selection population. 'Camarosa', 'Strawberry Festival', 'Mara des Bois', 'Treasure', and 'Winter Dawn', with the homozygous genotype "aa", were included as reference γ-D non-producers. The experiment was performed with three biological replicates, each containing at least 5-8 fruits. Total RNA was extracted according to the protocol of the Spectrum™ Plant Total RNA Kit (Sigma-Aldrich, St. Louis, MO, United States) and resuspended in a total volume of 50 µl of RNase-free water. cDNA synthesis was performed with the LunaScript RT SuperMix Kit (New England Biolabs, Ipswich, MA). The primer sequence of FaFAD1, GD_Rt1, was obtained from NCBI (GenBank Accession: KF887973.1). The strawberry GAPDH gene (GenBank Accession: AB363963.1), FaGAPDH2, was selected as an endogenous control (Supplementary Table 3). qRT-PCRs were carried out with a LightCycler® 480 system II (Roche Life Science, Germany) using Forget-Me-Not™ qPCR Master Mix (Biotium, Corporate Place Hayward, CA, United States). Reactions were carried out in a total volume of 5 µl, which contained 1 µl of template cDNA, 0.2 µl of each primer, 2.5 µl of EvaGreen qPCR Master Mix, and 1.1 µl of double-distilled H2O. The reaction conditions were 95°C for 5 min, followed by 40 cycles of 95°C for 20 sec, 60°C for 20 sec, and 72°C for 20 sec. This was followed by melting curve analysis to confirm that each amplicon yielded a single product. The qRT-PCR experiment was repeated with at least three technical replicates for each biological replicate. Relative fold difference was calculated using the 2^−ΔΔCt method (Paolacci et al., 2009).
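As an illustration of the relative quantification step, the sketch below assumes the widely used 2^−ΔΔCt form, in which the target gene (here FaFAD1) is first normalized to the endogenous control (here FaGAPDH2) and then to a calibrator sample. The function name, the choice of calibrator, and all Ct values are hypothetical, and amplification efficiency is assumed to be close to 100% for both primer pairs.

```python
from statistics import mean


def relative_expression(ct_target, ct_control, ct_target_cal, ct_control_cal):
    """Fold difference of a sample versus a calibrator by 2^-ddCt.

    Each argument is a list of technical-replicate Ct values.
    """
    # dCt: target normalized to the endogenous control within each sample.
    delta_ct_sample = mean(ct_target) - mean(ct_control)
    delta_ct_cal = mean(ct_target_cal) - mean(ct_control_cal)
    # ddCt: sample dCt relative to calibrator dCt.
    delta_delta_ct = delta_ct_sample - delta_ct_cal
    return 2 ** (-delta_delta_ct)


# Hypothetical Ct values, for illustration only.
fold = relative_expression(
    ct_target=[22.0, 22.1, 21.9],      # target gene in the test sample
    ct_control=[18.0, 18.0, 18.0],     # endogenous control in the test sample
    ct_target_cal=[25.0, 25.0, 25.0],  # target gene in the calibrator
    ct_control_cal=[18.0, 18.0, 18.0], # endogenous control in the calibrator
)
print(fold)
```

A ΔΔCt of −3 corresponds to an eightfold higher transcript level than the calibrator, since each cycle represents roughly a doubling of product.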

Data Analysis and Experimental Design
For all comparisons of means among genotypes (AA, Aa, and aa), a t-test was performed using R software version 3.3.1 (R Core Team, 2017). The procedures and strategies used in this study to characterize the FaFAD1 region are summarized in Figure 1.
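A pairwise comparison of genotype means can be sketched as follows. The text does not state which t-test variant was used, so this example assumes Welch's unequal-variance test, which is the default of `t.test` in R; it is written in Python with the standard library only, and the function name and the input values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return Welch's t statistic and approximate degrees of freedom."""
    # Squared standard error of each group mean (sample variance / n).
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical normalized gamma-D peak areas for two genotype classes.
group_1 = [1.0, 2.0, 3.0]
group_2 = [2.0, 3.0, 4.0]
t_stat, dof = welch_t(group_1, group_2)
print(t_stat, dof)
```

The p-value would then be read from a t distribution with the returned degrees of freedom, which a statistics package such as R computes directly.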

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material. The sequence data of BAC_GD004 are available from NCBI (https://www.ncbi.nlm.nih.gov/, accession number MW584663).

AUTHOR CONTRIBUTIONS
YO performed the DNA extractions, bioinformatics study, HRM and KASP marker development, volatile analysis, and data analysis, and drafted the manuscript. CRB performed the genome-wide association study, bioinformatics study, and data analysis. SC performed hybridization-based capture sequencing. ZF, JB, and AP participated in volatile analysis. KMF and JP conceived and performed the transient assay. VMW participated in the design and coordination of the study. SL conceived the study, performed data analysis, and drafted the manuscript. All authors have read and approved the final manuscript.