Genetic Analysis of Methyl Anthranilate, Mesifurane, Linalool, and Other Flavor Compounds in Cultivated Strawberry (Fragaria × ananassa)

The cultivated strawberry (Fragaria × ananassa) is an economically important fruit crop that is intensively bred for improved sensory qualities. The diversity of fruit flavors and aromas in strawberry results mainly from the interactions of sugars, acids, and volatile organic compounds (VOCs) that are derived from diverse biochemical pathways influenced by the expression of many genes. This study integrates multiomic analyses to identify QTL and candidate genes for multiple aroma compounds in a complex strawberry breeding population. Novel fruit volatile QTL were discovered for methyl anthranilate, methyl 2-hexenoate, methyl 2-methylbutyrate, and mesifurane, and a shared QTL on Chr 3 was found for nine monoterpene and sesquiterpene compounds, including linalool, 3-carene, β-phellandrene, α-limonene, linalool oxide, nerolidol, α-caryophyllene, α-farnesene, and β-farnesene. Fruit transcriptomes from a subset of 64 individuals were used to support candidate gene identification. For methyl esters including the grape-like methyl anthranilate, a novel ANTHRANILIC ACID METHYL TRANSFERASE–like gene was identified. Two mesifurane QTL correspond with the known biosynthesis gene O-METHYL TRANSFERASE 1 and a novel FURANEOL GLUCOSYLTRANSFERASE. The shared terpene QTL contains multiple fruit-expressed terpenoid pathway-related genes including NEROLIDOL SYNTHASE 1 (FanNES1). The abundance of linalool and other monoterpenes is partially governed by a co-segregating expression-QTL (eQTL) for FanNES1 transcript variation, and there is additional evidence for quantitative effects from other terpenoid-pathway genes in this narrow genomic region. These QTLs present new opportunities in breeding for improved flavor in commercial strawberry.


INTRODUCTION
The dessert strawberry (Fragaria × ananassa) is a widely celebrated fruit with increasing consumption. For decades, consumers have reported the desire for improved flavor in commercial strawberry (Fletcher, 1917;Chambers, 2013). The aroma intensity of modern cultivars is lower than in wild strawberries (Ulrich and Olbricht, 2014), and breeding efforts seek to reclaim these qualities. Today, flavor and aroma are central priorities of strawberry breeding programs (Faedi et al., 2002;Whitaker et al., 2011;Vandendriessche et al., 2013). However, breeders face a significant challenge in the recapture and consolidation of genetics contributing to favorable flavors and aromas. Genetic and genomic analysis has been used to identify these elements and contribute to the breeding of new cultivars with improved sensory qualities.
Strawberry flavor and aroma are dictated by several factors, including sugars and acids, but it is the trace volatile organic compounds (VOCs) that shape the sensory experience (Bood and Zabetakis, 2002). VOCs, represented broadly as esters, alcohols, terpenoids, furans, and lactones, are a substantial portion of the fruit secondary metabolome and contribute to aroma, flavor, disease resistance, pest resistance, and overall fruit quality (Ulrich et al., 1997;Arroyo et al., 2007). Various studies have helped to identify human preferences for individual strawberry aroma and flavor compounds (Larsen and Poll, 1992;Schieberle and Hofmann, 1997;Ulrich et al., 1997;Schwieterman et al., 2014). Of hundreds of strawberry VOCs, these studies agree on fewer than 10 that clearly influence human preference (Schwieterman et al., 2014). Introgressing important compounds into commercially viable cultivars has been aided by efforts in volatilomic QTL detection (Urrutia et al., 2017), multiomic identification of VOC candidate genes (Chambers et al., 2014;Pillet et al., 2017), integration of sensory and consumer preference data (Sánchez-Sevilla et al., 2014;Schwieterman et al., 2014), and ultimately introgression of genes via marker-assisted selection (Eggink et al., 2014;Folta and Klee, 2016;Rambla et al., 2017).
Only a few strawberry genes controlling desirable aroma compounds have been identified with confidence. These include biosynthesis genes for linalool (Aharoni et al., 2004), mesifurane (Wein et al., 2002), γ-decalactone (Chambers et al., 2014;Sánchez-Sevilla et al., 2014), and methyl anthranilate (Pillet et al., 2017). The gene ANTHRANILIC ACID METHYL TRANSFERASE (FanAAMT), located on octoploid chromosome group 4, is a necessary-but-not-sufficient gene for catalyzing the methylation of anthranilate into the grape-like aroma compound methyl anthranilate (Pillet et al., 2017). Methyl anthranilate production has long been regarded as a complex trait, governed by multiple genes and strong environmental influences. Methyl anthranilate is produced abundantly in the fruit of the diploid strawberry species Fragaria vesca, but it is reported in only a few octoploid varieties including "Mara des Bois" and "Mieze Schindler" (Ulrich and Olbricht, 2016). For terpenoid biosynthesis, strawberry NEROLIDOL SYNTHASE 1 (NES1) was identified by comparing diploid and octoploid species, which are enriched for nerolidol and linalool, respectively. A truncated plastid-targeted signal in the octoploid FanNES1 gene retargets the enzyme to the cytosol, where there is abundant precursor for linalool biosynthesis (Aharoni et al., 2004). Recent reports have complicated this story somewhat, as three octoploid F. virginiana lines, hexaploid F. moschata, and some diploid strawberries produce linalool without this truncation (Ulrich and Olbricht, 2013). In mesifurane biosynthesis, the gene O-METHYL TRANSFERASE 1 (FanOMT1) catalyzes the methylation of furaneol to create mesifurane (Wein et al., 2002;Zorrilla-Fontanesi et al., 2012). Mesifurane abundance is affected by a common FanOMT1 promoter loss-of-function allele, which eliminates both gene expression and mesifurane production.
Only one copy of the competent FanOMT1 allele is reportedly sufficient for robust production; however, a lack of production is sometimes observed even in the homozygous positive state (Cruz-Rus et al., 2017). An octoploid gene encoding QUINONE REDUCTASE (FanQR) can produce furaneol in vitro; however, no natural variants of this gene that alter mesifurane levels in vivo have been established (Raab et al., 2006). Similarly, glucosylation of both furaneol and mesifurane is known to occur in strawberry; however, genetic variation has not been established for this step. Several furaneol glucosyltransferases have been cloned and characterized in vitro from F. × ananassa (Song et al., 2016;Yamada et al., 2019).
This research integrates high-density genotyping and nontargeted fruit volatile metabolomics from eight pedigree-connected octoploid crosses (n = 213) (Supplementary Figure 1). Fruit transcriptomes from a subset of individuals (n = 61) were used to identify fruit-expressed candidate genes within QTL regions (Supplementary Figure 1). Three maps were utilized independently in this analysis, as less than one-third of octoploid subgenome-specific markers are incorporated in any single octoploid genetic map (van Dijk et al., 2014;Anciro et al., 2018). These are the "Holiday" × "Korona" (van Dijk et al., 2014) and FL_08-10 × 12.115-10 (Verma et al., 2017) genetic maps, and the F. vesca physical map. The correspondence of "Holiday" × "Korona" linkage groups to the recent octoploid "Camarosa" reference genome helped specify the subgenomic identity of QTL (Hardigan et al., 2020). To match QTL markers to specific candidate gene regions, marker nucleotide sequences from the IStraw35 SNP genotyping platform were aligned to the octoploid genome. However, the very high sequence identity between homoeologous chromosomes limited the specificity of this approach. Evidence from all of these resources was integrated to specify candidate gene regions.
Fruit transcriptomes were used to identify expressed genes within QTL regions and to associate trait/transcript levels. Genotypic data were associated with fruit transcriptomics data via expression-QTL (eQTL) analysis. Transcript eQTL analysis identifies genetic variants associated with heritable transcript level variation. Transcript eQTL often correspond to the locus of the originating gene (cis-eQTL), and often signify gene promoter mutation or gene presence/absence variation. In cases where a trait is governed by simple genetic control of transcript levels of a causal gene, an eQTL that co-segregates with trait QTL markers should be detected. This approach can help specify the causal mechanisms behind trait QTL. Previous eQTL analyses in these same fruit RNA-seq populations identified hundreds of fruit eQTL, the vast majority of which were proximal to the originating gene locus. Global transcriptomes from tissues throughout the octoploid cultivar "Camarosa" were used to correlate candidate genes and transcript abundance with ripening-associated volatile biosynthesis.
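As a rough illustration of what trait QTL/eQTL co-segregation means at a single marker, the following Python sketch computes the squared Pearson correlation between marker allele dosage and both a transcript level and a trait value. It is not the mixed linear model analysis used in this study; the dosage coding (AA = 0, AB = 1, BB = 2), the function name `r_squared`, and all data values are hypothetical.

```python
from math import sqrt

def r_squared(x, y):
    """Squared Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (>= 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r^2 is undefined when a variable has zero variance")
    return (sxy / sqrt(sxx * syy)) ** 2

# Hypothetical toy data for six individuals at one marker:
# allele dosage (AA=0, AB=1, BB=2), transcript abundance (TPM),
# and volatile relative abundance.
dosage = [0, 0, 1, 1, 2, 2]
transcript_tpm = [0.1, 0.3, 12.0, 15.0, 30.0, 27.0]
volatile = [0.0, 0.2, 1.1, 0.4, 2.5, 1.9]

print(f"marker vs transcript r^2 = {r_squared(dosage, transcript_tpm):.3f}")
print(f"marker vs trait      r^2 = {r_squared(dosage, volatile):.3f}")
```

In this toy case the marker explains more of the transcript variance than of the trait variance, which is the pattern one might expect when a transcript is under strong single-locus control while the trait is influenced by additional loci.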

Sample Processing and Preparation
Crushed frozen fruit samples were equilibrated from −80 °C to liquid nitrogen temperature before being pureed in an electric blender. Fine frozen puree was collected into a 50-ml sterile collection tube and stored at −80 °C. For volatile sample processing, 3 g of frozen puree from three combined fruits per harvest timepoint were aliquoted into two technical replicate 20-ml headspace vials and combined with 3 ml of 35% NaCl solution containing 1 ppm of 3-hexanone as an internal standard. Prepared vials were stored at −80 °C, thawed at room temperature, and vortexed prior to GC-MS analysis.

Volatile Metabolomic Profiling and Analysis
Samples were equilibrated to 40 °C for 30 min in a 40 °C heated chamber. A 2-cm tri-phase SPME fiber (50/30 µm DVB/Carboxen/PDMS, Supelco, Bellefonte, PA, United States) was exposed to the headspace for 30 min at 40 °C for volatile collection and concentration. The fiber was then injected into an Agilent 6890 GC for 5 min at 250 °C for desorption of volatiles. Inlet temperatures were maintained at 250 °C, ionizing sources at 230 °C, and transfer line temperatures at 280 °C. The separation was performed via a DB5ms capillary column (60 m × 250 µm × 1.00 µm) (J&W, distributed by Agilent Technologies) at a constant flow (He: 1.5 ml per min). The initial oven temperature was maintained at 40 °C for 30 s, followed by a 4 °C per min increase to a final temperature of 230 °C, then to 260 °C at 100 °C per min, with a final hold time of 10 min. Data were collected using the Chemstation G1701 AA software (Hewlett-Packard, Palo Alto, CA, United States).
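The oven program described above implies a fixed chromatographic run time per sample. The short Python sketch below works through that arithmetic; the function name `gc_run_time_minutes` is hypothetical, and the calculation assumes ideal linear ramps with no equilibration or cool-down time.

```python
def gc_run_time_minutes(start_c, initial_hold_min, ramps, final_hold_min):
    """Total oven-program time: initial hold + each linear ramp + final hold.

    ramps is a list of (target_temp_c, rate_c_per_min) tuples applied in order.
    """
    total = initial_hold_min
    current = start_c
    for target_c, rate_c_per_min in ramps:
        total += (target_c - current) / rate_c_per_min
        current = target_c
    return total + final_hold_min

# Values taken from the oven program in the text.
run_time = gc_run_time_minutes(
    start_c=40.0,
    initial_hold_min=0.5,                  # 30 s hold at 40 °C
    ramps=[(230.0, 4.0), (260.0, 100.0)],  # 4 °C/min, then 100 °C/min
    final_hold_min=10.0,
)
print(f"{run_time:.1f} min")  # prints 58.3 min
```

The 4 °C per min ramp dominates (47.5 min of the 58.3 min total), so each sample occupies the instrument for roughly an hour before any cool-down.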
Chromatograms were processed using the Metalign metabolomics preprocessing software package (Lommen and Kools, 2012). Baseline and noise corrections were performed using a peak slope factor of 1 × noise, and a peak threshold factor of 2 × noise. Autoscaling and iterative pre-alignment options were not selected. A maximum shift of 100 scans before peak identification and 200 scans after peak identification was used. In later validation steps, these search tolerances were determined to be sufficiently inclusive while also limiting false positives. The MSClust software package was then used for statistical clustering of ions based on retention time and co-variance across the population using default parameters (Tikunov et al., 2012). Clusters were batch queried against the NIST08 reference database using Chemstation G1701 AA software (Hewlett-Packard, Palo Alto, CA, United States). Library search outputs were parsed using a custom Perl script prior to multivariate analysis. Chromatograms were batched by GC-MS sampling year to mitigate ion misalignments caused by system-dependent retention time shifts. VOCs from different seasonal datasets were consolidated manually based on elution order, NIST identification, and rerunning of sample standards. VOC relative abundances between seasons were normalized based on the relative abundance of the 1-ppm 3-hexanone internal standard. Internal standard renormalization was not performed on within-season data as technical variation was low and the spike-in tended to introduce more variation than it resolved, which is a known issue in non-targeted analyses (Wehrens et al., 2016). All within-season technical and biological replicate VOC relative abundances were averaged.
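The between-season normalization can be pictured with a small Python sketch. The text does not give the exact scaling formula, so this example assumes the simplest reading: replicates are averaged within a season, and each VOC mean is then divided by that season's mean 3-hexanone signal. The function name `normalize_season` and all numbers are hypothetical.

```python
from statistics import mean

def normalize_season(voc_replicates, internal_standard_replicates):
    """Average replicates per VOC, then scale by the season's mean
    internal-standard signal (assumed scheme, not confirmed by the source)."""
    is_mean = mean(internal_standard_replicates)
    if is_mean <= 0:
        raise ValueError("internal standard signal must be positive")
    return {voc: mean(vals) / is_mean for voc, vals in voc_replicates.items()}

# Hypothetical example: season B has a detector response twice that of
# season A, visible in both the VOC and the internal standard.
season_a = normalize_season({"linalool": [100.0, 120.0]}, [50.0, 50.0])
season_b = normalize_season({"linalool": [230.0, 210.0]}, [100.0, 100.0])
print(season_a["linalool"], season_b["linalool"])
```

After scaling, the two seasons report the same relative linalool abundance even though the raw signals differ by a factor of two.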

Genotyping of Flavor and Aroma Populations
Individuals from all populations were genotyped using the IStraw90 (Bassil et al., 2015) platform, except populations 16.11 and 16.89, which were genotyped using the IStraw35 platform (Verma et al., 2017). All parents and 204 progenies were selected for genotyping based on the segregation of desirable fruit volatiles. Sequence variants belonging to the poly high resolution (PHR) and no minor homozygote (NMH) marker classes were included for association mapping. Mono high resolution (MHR), off-target variant (OTV), call rate below threshold (CRBT), and other marker quality classes were discarded and not used for mapping. Individual marker calls inconsistent with Mendelian inheritance from parental lines were removed.
Lowercase letters indicate statistically significant mean differences at p < 0.05 (ANOVA). (C) Effect sizes for the methyl anthranilate QTL on Chr 5 and the putative Chr 7 QTL.
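The marker filtering described in this section can be sketched in Python as follows. This is an illustrative sketch only: it assumes biallelic calls coded as "AA", "AB", or "BB" with diploid-like inheritance (as expected for subgenome-specific SNPs), and the names `filter_markers` and `mendelian_consistent` are hypothetical rather than part of any genotyping software.

```python
KEEP_CLASSES = {"PHR", "NMH"}  # marker classes retained for mapping in the text

def filter_markers(marker_classes):
    """Return marker IDs whose quality class is retained for mapping."""
    return [m for m, cls in marker_classes.items() if cls in KEEP_CLASSES]

def mendelian_consistent(parent1, parent2, child):
    """True if the child's call can arise from one allele of each parent."""
    possible = {"".join(sorted(a + b)) for a in parent1 for b in parent2}
    return "".join(sorted(child)) in possible

# Hypothetical marker IDs and classes.
classes = {"m1": "PHR", "m2": "OTV", "m3": "NMH", "m4": "CRBT"}
print(filter_markers(classes))                   # ['m1', 'm3']
print(mendelian_consistent("AA", "BB", "AB"))    # True
print(mendelian_consistent("AA", "AB", "BB"))    # False
```

A call failing the second check would be removed for that individual only, leaving the marker itself available for the rest of the population.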

Fruit Transcriptome Assembly and Analysis
Mature fruits from 61 parents and progeny from the biparental populations 10.113, 13.75, and 13.76 were sequenced via Illumina paired-end RNA-seq (average 65 million, 2 × 100-bp reads) and used for transcript eQTL analysis via the same samples and methodology reported for R-genes (Barbey et al., 2019) and other high-value fruit transcripts. Briefly, RNA-seq reads were assembled based on the Fragaria × ananassa octoploid "Camarosa" annotated genome, with reads mapping equally well to multiple loci discarded from the analysis. Separately, raw RNA-seq reads from the "Camarosa" strawberry gene expression atlas study were assembled via the same previously reported methodology and represent the average of three biological replicates. Transcript abundances were calculated in transcripts per million (TPM). Fruit eQTL analysis was performed using the mixed linear model method implemented in GAPIT v3 (Tang et al., 2016) as described previously.
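For readers unfamiliar with the TPM unit, the following Python sketch shows the standard calculation from read counts and transcript lengths. It is a generic illustration, not the pipeline used in this study; the function name `tpm` and the gene names and counts are hypothetical.

```python
def tpm(counts, lengths_bp):
    """Transcripts per million from read counts and transcript lengths (bp)."""
    # Reads per kilobase corrects each transcript for its length.
    rpk = {gene: counts[gene] / (lengths_bp[gene] / 1000.0) for gene in counts}
    # Scaling so that all values sum to one million corrects for depth.
    scale = sum(rpk.values()) / 1e6
    return {gene: value / scale for gene, value in rpk.items()}

# Hypothetical counts: geneB has twice the reads but is twice as long,
# so both genes end up with the same TPM.
example = tpm({"geneA": 100, "geneB": 200}, {"geneA": 1000, "geneB": 2000})
print(example)
```

Because TPM values always sum to one million within a sample, they are comparable as proportions across the fruit transcriptomes regardless of sequencing depth.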

Genetic Association of Fruit Volatiles
Relative volatile abundance values were rescaled using the Box-Cox transformation algorithm (Box and Cox, 1964) performed in R (R. Development Core Team, 2014) using R-studio (Racine, 2011) prior to genetic analysis. GWAS on fruit volatiles was performed using the mixed linear model method implemented in GAPIT v3 (Tang et al., 2016) in R, using marker positions oriented to the F. vesca diploid physical map. Significantly associated volatiles were then reanalyzed in GAPIT using the "Holiday" × "Korona" and FL_08-10 × 12.115-10 genetic maps. Metabolomic associations were evaluated for significance based on the presence of multiple co-locating markers of p-value < 0.05 after FDR multiple comparisons correction (Benjamini and Hochberg, 1995). Narrow-sense heritability (h²) estimates were derived from GAPIT v3, while single-marker analysis was performed via ANOVA in R to investigate allelic effects.
FIGURE 2 | Methyl ester and methyl anthranilate candidate genes. (A) The Chr 2 methyl ester QTL markers correspond to two homoeologous physical regions containing anthranilic acid methyl transferase-like (AAMT-like) genes on Chr 2-1 (top) and Chr 2-3 (bottom). (B) AAMT-like deduced proteins in the "Camarosa" genome are shown in a neighbor-joining cladogram, with transcript abundance heatmaps representing the highest TPM detected among the fruit transcriptomes. The Chr 2-1 and Chr 2-3 AAMT-like candidates are highly identical to the published AAMT-like "Camarosa" homolog (Chr 4-1 FanAAMT) and are highly abundant transcripts in the fruit.
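The Box-Cox rescaling and Benjamini-Hochberg correction used in the volatile association analysis can be illustrated with a minimal Python sketch. The study performed these steps in R; this version is only a conceptual stand-in. The function names are hypothetical, and the Box-Cox parameter `lam` is passed in directly here, whereas in practice it is usually estimated from the data.

```python
import math

def box_cox(values, lam):
    """Box-Cox transform of strictly positive values for a given lambda."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg FDR-adjusted p-values, in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical marker p-values for one volatile.
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)
significant = [i for i, p in enumerate(adj_p) if p < 0.05]
print(adj_p, significant)
```

Under the criterion stated in the text, a region would be considered associated only when several co-locating markers pass the adjusted threshold, not on the basis of a single marker.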

Analysis of Candidate Genes
All gene models in the "Camarosa" genome were analyzed with the BLAST2GO pipeline and the Pfam protein domain database. Genes with significant homology to known volatile biosynthesis genes including FanOMT and FanAAMT were collected from the "Camarosa" genome using BLAST with inclusive criteria. This process was replicated for candidate genes including anthranilate synthase alpha subunit (FanAS-α) and others not presented in this analysis. Deduced protein sequences from transcripts were aligned using the slow progressive alignment algorithm in the CLC Genomics Workbench 11 (Gap Open cost = 10; Gap Extension = 1). Tree construction was performed using the neighbor joining method with the Jukes-Cantor distance measure and 1,000 bootstrap replicates. Fruit transcript heatmaps were added to the cladogram to show the maximum transcript level detected among the 61 fruit transcriptomes.
(B) The Chr 7 mesifurane QTL markers correspond to a region containing the published mesifurane FanOMT1 biosynthesis gene. (C) A cis-eQTL was detected for FanOMT1 (top), a gene known to be transcriptionally variable due to a common allele. The FanOMT1 eQTL co-segregates with mesifurane (bottom) but is confounded by additional factors. Lowercase letters indicate statistically significant mean differences at p < 0.05 (ANOVA). (D) Allelic combinations at the Chr 1 and Chr 7 markers demonstrate an epistatic effect between the two loci. At least one competent allele at each locus is required for robust production of mesifurane (AA/BB vs. AB/BB Tukey HSD p = 2.4e-7), while the double-homozygous allelic state produces statistically elevated mesifurane levels (BB/BB vs. AB/AB, AB/BB, and BB/AB ANOVA) [F(1,149) = 4.38, p = 0.038]. The Chr 1/Chr 7 genotypes in parental lines are shown with the n of each allelic category. Dotted lines represent the population means. Lowercase letters indicate statistically significant mean differences at p < 0.05 (ANOVA).
Genes putatively belonging to published volatile biosynthesis gene families were selected for fruit transcript eQTL analysis, using methods described previously (Barbey et al., 2019). The 200 genes surrounding the most-correlated volatile QTL markers were also analyzed for eQTL and compared for co-segregation with volatile QTL.
A possible third methyl anthranilate signal on Chr 7 corresponds with the position of two ANTHRANILATE SYNTHASE ALPHA (FanAS-α) homoeologs (Supplementary Figure 5A and Supplementary Table 3). Both genes represent the only FanAS-α transcripts abundant in the fruit (Supplementary Figure 5B), and presence/absence variation of the Chr 7-4 FanAS-α transcript is governed by a cis-eQTL, which co-segregates with the methyl anthranilate signal (Supplementary Figures 5C,D).
In diverse tissues of the "Camarosa" plant (AX-166520175 = BB), the candidate Chr 1-2 furaneol glucosyltransferase transcript levels are high in roots but low in the ripe fruit, with fruit expression increasing somewhat across the ripening series (Supplementary Table 3). In the mature fruit RNA-seq populations, the Chr 1-2 candidate is modestly expressed in the fruit (5.3 ± 2.2 TPM).
The LG 3B terpene QTL IStraw35 probe sequences align non-specifically to all four Chr 3 homoeologs (Table 3). These corresponding genomic regions contain putative terpenoid biosynthesis gene clusters, which together contain three annotated copies of (3S,6E)-NEROLIDOL SYNTHASE, three copies of (E,E)-ALPHA-FARNESENE SYNTHASE, and three copies of SOLANESYL DIPHOSPHATE SYNTHASE (Table 3). The characterized FanNES1 deletion responsible for linalool biosynthesis in octoploids was not detected among the "Camarosa" FanNES1-like gene sequences; however, this gene appears on Chr 3-3 in the updated "Camarosa" v2 genome (Hardigan, personal communication).

DISCUSSION
Many QTLs were discovered for strawberry flavor and aroma compounds known to influence the human sensory experience. These QTLs are derived from eight biparental crosses phenotyped across multiple seasons under a commercial cultural system in central Florida and are likely to be useful for making genetic gains in related germplasm. Markers correlated with these traits may be used to guide breeding decisions and identify and select for alleles mediating flavor and aroma. Potential causal genes were identified via a multiomic approach, and provide a foundation for possible gene-editing-based approaches to improved strawberry flavor. These genetic discoveries represent new opportunities for improving flavor in commercial strawberry, and advance the basic understanding of the molecular mechanisms driving fruit flavor and aroma.

Methyl Anthranilate
Consistent with the long-standing polygenic hypothesis for methyl anthranilate production in octoploid strawberry, multiple QTLs were identified for this trait. Multiomic analysis of QTL regions implicated several likely causal genes within distinct QTL. Because many discrete loci affect methyl anthranilate levels, and because environmental interactions transiently induce wide phenotypic swings including trait presence/absence, the interactions between loci could not be reliably measured in this sample size and diverse set of crosses. However, no single locus could be identified as required for production. The published FanAAMT gene on Chr 4, which did not emerge as a QTL in this analysis, was identified solely in the context of the biparental population "Florida Elyana" × "Mara des Bois" (population "10.133"), which contained only 13 analyzed progeny from one cross (Pillet et al., 2017). It is possible that these differences are due to segregating genetic factors becoming fixed or lost in subsequent populations. This hypothesis is supported by the low positive rates resulting from F1 backcrosses to "Mara des Bois" (Chambers, 2013) and the fact that population "10.133" does not independently support the identified QTL regions. This QTL analysis is composed mostly of populations using the parent "12.115-10," which is a descendant of "Mara des Bois" that produces more methyl anthranilate than its ancestor. It is likely that this breeding line has been enriched for favorable methyl anthranilate genetics. These findings might relate more to quantitative differences in methyl anthranilate abundance, rather than the genetic presence/absence, which historically defines this rare trait among strawberry cultivars.
The methyl anthranilate LG2A QTL is positively correlated with the production of two other methyl ester volatiles, namely, methyl 2-hexenoate and methyl 2-methylbutyrate. Consistent with historical segregation ratios, which implicate methyl anthranilate as a polygenic trait, less methyl anthranilate variance is explained by this QTL compared with the other two methyl ester volatiles. As their precursors are not closely related, a single promiscuous methyl transferase offers a parsimonious explanation. In Pillet et al. (2017), moderate methyl anthranilate levels were occasionally detected in the near absence of the published FanAAMT transcript, which suggests that additional methyl transferases may contribute.
While hundreds of methyl transferase genes exist in the octoploid genome, only the published FanAAMT has experimentally demonstrated affinity for anthranilate. Four FanAAMT-like transcripts were abundantly detected in mature octoploid fruit transcriptomes. Two of these expressed FanAAMT-like genes correspond to the QTL on Chr 2, located within two genes (Chr 2-1) and four genes (Chr 2-3) from the most-correlated QTL markers. The expressed Chr 4-1 AAMT-like sequence in the "Camarosa" genome is the most similar to the published Chr 4 AAMT sequence, whose subgenomic identity was not established (Pillet et al., 2017). This gene on Chr 4-1 might be the FanAAMT gene in "Camarosa," particularly as RNA-seq reads from fruit transcriptomes have high sequence fidelity with this gene reference (Supplementary Figure 3).
Genetic mapping suggests only a single methyl anthranilate QTL for chromosome group 2, which should be located on "Camarosa" Chr 2-2. However, this QTL marker region in the Chr 2-2 physical sequence is completely absent. As "Camarosa" is not capable of producing methyl anthranilate, one or more required genetic elements are expected to be missing in this reference genome. Poor RNA-seq sequence agreement with the FanAAMT-like Chr 2-1 and Chr 2-3 homoeologs suggests that the correct position of these transcript reads is not in the "Camarosa" genome. It is unlikely that RNA-seq reads corresponding to the published Chr 4 FanAAMT transcript would map falsely to the Chr 2 candidate loci, as the published sequence is most similar to the Chr 4-1 FanAAMT gene, and the RNA-seq mapping criteria exclude all non-specific reads.
A comparative pan-genome analysis using a methyl anthranilate-producing individual would be highly informative and will be undertaken in the future.
No candidate genes belonging to the hypothesized methyl anthranilate pathway are located in the Chr 5-4 region of the "Camarosa" reference. However, a co-segregating transcript cis-eQTL was detected for a putative glutathione peroxidase gene. Many, but not all, significant markers were shared between the trait QTL and transcript cis-eQTL, since methyl anthranilate levels are influenced at multiple loci, while the candidate transcript is under strong single locus control. In microbes, there is precedent for heme peroxidase activity catalyzing methyl anthranilate biosynthesis (Van Haandel et al., 2000); however, this reaction is unlikely to proceed via a glutathione peroxidase. It is possible that this cis-eQTL is simply in close linkage with the actual causal gene, which was either not correctly identified or is not present in the "Camarosa" reference genome.
A possible third methyl anthranilate QTL corresponds with two Chr 7 ANTHRANILATE SYNTHASE ALPHA (FanAS-α) homoeologs. Presence/absence variation of the Chr 7-4 FanAS-α transcript is governed by a cis-eQTL, which co-segregates with the putative methyl anthranilate markers at this locus. The Chr 7-2 FanAS-α transcript also demonstrates transcript presence/absence variation, but this is apparently due mostly to non-heritable factors that are uncorrelated with Chr 7-4 transcript level variation. Although there are few methyl anthranilate-positive individuals among 61 fruit transcriptomes, none of the 10 individuals with zero combined FanAS-α expression shows methyl anthranilate production. This pathway mechanism is consistent with previous findings implicating FanAAMT as necessary-but-not-sufficient for methyl anthranilate production (Pillet et al., 2017). The absence of anthranilate substrate in the mature fruit would help explain the observed absence of methyl anthranilate production even when FanAAMT transcript levels are high. Further efforts to validate this potential QTL signal are underway.

Mesifurane
Mesifurane (2,5-dimethyl-4-methoxy-3(2H)-furanone or DMMF) is derived from the methylation of furaneol (4-hydroxy-2,5-dimethyl-3(2H)-furanone, or HDF) by FanOMT1 (Wein et al., 2002;Zorrilla-Fontanesi et al., 2012). Mesifurane variance is influenced by a loss-of-function mutation in the FanOMT1 promoter, which eliminates transcription and mesifurane production. This model was validated by the detection of a cis-eQTL for the published FanOMT1 gene, which co-segregates with the Chr 7B mesifurane trait QTL. A novel mesifurane QTL was detected on Chr 1A, which is in epistasis with the Chr 7B QTL. This QTL region contains a fruit-expressed furaneol glucosyltransferase, which is 95% identical to a characterized furaneol glucosyltransferase from F. × ananassa. A substrate-restricting glucosyltransferase candidate is consistent with the epistatic interaction detected with the FanOMT1 locus. Depletion of substrate via glucosylation would limit mesifurane biosynthesis regardless of high FanOMT1 transcript levels. Conversely, elimination of the FanOMT1 transcript would eliminate mesifurane production regardless of substrate availability. The LG 1A mesifurane QTL was subsequently confirmed using two validation populations, providing robust support for this QTL. This two-gene model for mesifurane biosynthesis in cultivated strawberry can be exploited for genetic gain via marker-assisted selection. Moderate mesifurane levels can be maintained via dual selection for heterozygous/heterozygous allelic states, and somewhat elevated mesifurane levels can be achieved via double-homozygote selection. These findings may resolve some of the outstanding questions in mesifurane genetics posed by Cruz-Rus et al. (2017).
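The two-locus model described above lends itself to a simple tabulation of trait means by genotype class, which is how a breeder might inspect the interaction before marker-assisted selection. The Python sketch below is illustrative only: the records are invented, the function names are hypothetical, and it assumes that "B" denotes the competent allele at both the Chr 1 and Chr 7 markers.

```python
from collections import defaultdict
from statistics import mean

def genotype_class_means(records):
    """Mean trait value for each (chr1_genotype, chr7_genotype) class."""
    groups = defaultdict(list)
    for chr1_gt, chr7_gt, value in records:
        groups[(chr1_gt, chr7_gt)].append(value)
    return {cls: mean(vals) for cls, vals in groups.items()}

def competent_at_both_loci(chr1_gt, chr7_gt, competent_allele="B"):
    """True if at least one assumed-competent allele is present at each locus."""
    return competent_allele in chr1_gt and competent_allele in chr7_gt

# Hypothetical (Chr 1 genotype, Chr 7 genotype, mesifurane abundance) records.
records = [
    ("AA", "BB", 0.1), ("AA", "AB", 0.0),
    ("AB", "AA", 0.1), ("BB", "AA", 0.0),
    ("AB", "AB", 2.0), ("AB", "BB", 2.4),
    ("BB", "AB", 2.2), ("BB", "BB", 3.5), ("BB", "BB", 3.1),
]

for (g1, g7), m in sorted(genotype_class_means(records).items()):
    label = "competent at both loci" if competent_at_both_loci(g1, g7) else "expected low"
    print(f"Chr1={g1} Chr7={g7} mean={m:.2f} ({label})")
```

In the invented data, classes lacking a competent allele at either locus sit near zero, while the double-homozygous class has the highest mean, mirroring the qualitative pattern reported for the two mesifurane QTL.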

Terpenes
Homoeologous terpene gene arrays were detected for nine strawberry mono- and sesquiterpene QTL, including the desirable compound linalool. In citrus, monoterpenes and sesquiterpenes co-locate to single genomic QTL containing paralogous terpene synthases (Yu et al., 2017). We identify a similar phenomenon in cultivated strawberry. This terpene hotspot contains clusters of multiple terpenoid synthase classes, in addition to homoeologous genes on three of four subgenomes. The known biosynthesis gene FanNES1 was associated with terpene levels via trait/transcript level correlations and trait QTL/eQTL co-segregation. Solanesyl diphosphate synthase may contribute to terpene abundances as well. The cis-eQTL/QTL genetic association with solanesyl diphosphate synthase on Chr 3-3 helps support the subgenomic location of FanNES1 and the shared terpenoid QTL in the "Camarosa" genome, despite only two markers being genetically mapped and probe nucleotide sequences aligning to multiple subgenomes. It is possible that the influence of other terpene-related genes in this array remains undetected due to limitations in genome completeness, marker subgenome ambiguity, and/or presence/absence variation among genomes. With additional octoploid strawberry genomes for comparison and improved subgenomic genotyping tools, complex associations in octoploid strawberry will become more robust.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: https://www.ncbi.nlm.nih.gov/, SRP039356; https://www.ebi.ac.uk/ena, PRJEB12420.

AUTHOR CONTRIBUTIONS
CB, SV, KF, and VM contributed to study conception and design. CB and SV performed QTL analysis. CB, MH, BH, and AS prepared DNA, RNA, and GC/MS samples, and performed GC/MS data analysis, eQTL analysis, and eQTL/QTL co-localization analysis. SL and YO designed and executed the mesifurane marker validation experiment via HRM. CB wrote the manuscript and all authors contributed substantially to the editing process.

ACKNOWLEDGMENTS
We gratefully acknowledge Aristotle Koukoulidis for the genomic DNA isolation, Natalia Salinas for the assistance with GC/MS sample collection and preparation, Denise Tieman for the assistance with GC/MS operation, Nadia Mourad for the assistance compiling eQTL results, Andrew Hanson and Harry Klee for the project discussion and guidance, Alan Chambers and Jeremy Pillet for the RNA isolation and RNA-seq line selection, Zhen Fan for the manuscript editing and discussion, and Angelita Arredondo and Kelsey Cearley for the field expertise and assistance with fruit collection. This manuscript has been released as a pre-print at bioRxiv (Barbey C.R. et al., 2020).
The methyl anthranilate QTL on chromosome 5-4 (LG 5A) is shared with a cis-eQTL for a putative glutathione peroxidase transcript. (B) The range of both methyl anthranilate (r² = 0.181, p = 1.5e−5) abundance and glutathione peroxidase (r² = 0.511, p = 2.8e−9) transcript abundance is shown for the shared marker AX-123358624.
Supplementary Figure 5 | Methyl anthranilate Chr 7 candidate genes. (A) The putative Chr 7 methyl anthranilate signal corresponds to two homoeologous regions containing anthranilate synthase genes on Chr 7-2 (top) and Chr 7-4 (bottom). (B) Anthranilate synthase-like deduced proteins in the "Camarosa" genome are shown in a neighbor-joining cladogram, with transcript abundance heatmaps representing the highest TPM detected among the fruit transcriptomes. The anthranilate synthase alpha subunit candidate genes on Chr 7-2 and Chr 7-4 are highly abundant in the fruit, as are two corresponding beta subunit genes. (C) Variable transcript levels of one anthranilate synthase candidate (Chr 7-4) are governed by a transcript cis-eQTL. (D) The eQTL for the anthranilate synthase alpha candidate, which governs transcript presence/absence in the fruit (left), also co-segregates with the methyl anthranilate Chr 7 putative QTL (right).
Supplementary Figure 6 | High-resolution melting (HRM) curves for two Chr 1 mesifurane QTL markers. Ten individuals were initially confirmed to be either homozygous negative (red), heterozygous (green), or homozygous positive (blue) for the markers (A) AX-166520175 and (B) AX-166502845 based on melting curve properties. (C) ANOVA test statistics of fruit mesifurane abundance levels among 72 additional individuals tested by HRM confirm the Chr 1 mesifurane QTL markers.