Construction of ddRADseq-Based High-Density Genetic Map and Identification of Quantitative Trait Loci for Trans-resveratrol Content in Peanut Seeds

Resveratrol (trans-3,4′,5-trihydroxystilbene) is a natural stilbene phytoalexin that has also been reported to benefit human health. Cultivated peanut (Arachis hypogaea L.), a globally important legume crop, is one of the few dietary sources of resveratrol for humans. Although variation in resveratrol content among peanut varieties has been observed, its variation across environments and the underlying genetic basis have been poorly investigated. In this study, the resveratrol content in seeds of a recombinant inbred line (RIL) population (Zhonghua 6 × Xuhua 13, 186 progenies) was quantified by high performance liquid chromatography (HPLC) across four environments. Genotypes, environments and genotype × environment interactions significantly influenced the resveratrol content in the RIL population. A total of 8,114 high-quality single nucleotide polymorphisms (SNPs) were identified from double-digest restriction-site-associated DNA sequencing (ddRADseq) reads. These SNPs were clustered into bins using a reference-based method, which facilitated the construction of a high-density genetic map (2,183 loci with a total length of 2,063.55 cM) and the discovery of several chromosome translocations. Through composite interval mapping (CIM), nine additive quantitative trait loci (QTL) for resveratrol content were identified on chromosomes A01, A07, A08, B04, B05, B06, B07, and B10, with 5.07–8.19% of phenotypic variation explained (PVE). Putative genes within their confidence intervals might play roles in diverse primary and secondary metabolic processes. These results lay a foundation for further genetic dissection of resveratrol content as well as the breeding and production of high-resveratrol peanuts.


INTRODUCTION
Resveratrol is a non-flavonoid polyphenol compound that is synthesized and functions as a phytoalexin in more than 70 plant species (Pastor et al., 2019). Natural resveratrol exists mainly in the trans-isoform (Novelle et al., 2015) and has been identified in only a few edible plants such as grape, peanut, and berries (Pastor et al., 2019; Singh et al., 2019). Resveratrol is synthesized by stilbene synthase (STS) via the phenylalanine/polymalonate biosynthetic pathway (Shomura et al., 2005). In the early 1990s, the dietary intake of resveratrol from moderate red wine consumption was reported to contribute in part to the "French paradox," and studies have since been conducted to increase the resveratrol content of edible plants (e.g., grapes) that could be processed into health foods (Hasan and Bae, 2017). In addition, favorable effects of resveratrol on human health have been reported in many preclinical and clinical trials on cardiovascular diseases, cancer and other important human diseases, although inconsistent results were observed in other studies (Fogacci et al., 2019; Singh et al., 2019; Wang et al., 2019b).
As one of the dietary sources of resveratrol (Delaunois et al., 2009), the cultivated peanut (Arachis hypogaea L.) is an important legume crop grown in more than 100 countries (FAOSTAT, 2019). Its global production has been around 47 million tonnes in recent years (FAOSTAT, 2019), and its seeds are consumed worldwide in diverse forms, such as nuts, peanut butter, edible oil, and candies (Varshney et al., 2013). The resveratrol in peanut exists mainly in the trans-form, and an HPLC method has been developed to quantify its content. Significant variation in resveratrol content was observed among different peanut varieties (Sanders et al., 2000). Using the US mini-core collection as materials, Wang et al. (2013) found a ninefold difference (30-260 µg/kg) in seed resveratrol content across peanut accessions. However, that study was conducted in a single environment, and the variation of resveratrol content across environments was not evaluated. In addition, the genetic basis of this variation has been poorly investigated in peanut as well as in other resveratrol-producing plant species. Therefore, more genetic studies are needed to provide the theoretical basis and technical support for the breeding and production of novel high-resveratrol varieties.
QTL mapping using bi-parental genetic populations has become a routine approach to dissect the genetic basis of quantitative traits, such as yield-related traits, resistance to diseases, tolerance to drought, oil content and fatty acid composition (Varshney et al., 2009; Vishwakarma et al., 2017). However, the construction of a high-quality and high-density genetic map is a prerequisite for the discovery of genome-wide QTL. As the cost of high-throughput sequencing has fallen, SNPs have been used as markers to construct high-density genetic maps (Pandey et al., 2016). Through sequencing-based genotyping methods such as ddRADseq (Zhou et al., 2014), GBS (Dodia et al., 2019) and SLAF-seq (Hu et al., 2018), SNP-based genetic maps can be constructed in much less time than with traditional gel-based genotyping and at lower cost than with whole-genome resequencing (Agarwal et al., 2018). However, the previously reported high-density genetic maps in peanut were developed through de novo methods (Zhou et al., 2014; Hu et al., 2018; Dodia et al., 2019). With the availability of peanut genome sequences (Bertioli et al., 2019; Chen et al., 2019a; Zhuang et al., 2019), it is now feasible to construct a high-density genetic map through a reference-based approach, which might make better use of sequencing reads and help discover chromosome variations.
In the present study, the resveratrol content in peanut seeds of a RIL population was quantified across four environments. The effects of genotypes, environments and genotype × environment interactions on the variation of resveratrol content in the RIL population were analyzed. Using the recently published peanut genome sequence as reference, a high-density and high-quality genetic map was constructed through reference-based analysis, upon which nine additive QTL associated with resveratrol content in peanut seeds were identified. Functions of the putative genes located in the confidence intervals of the identified QTL were investigated.

Plant Materials
The RIL population derived from the cross between Xuhua 13 and Zhonghua 6 was used as plant material in the present study. Recently, ddRADseq reads were generated for 186 RILs in the F6 generation of the RIL population (Liu et al., 2020). Four generations (F6 to F9) of the RIL population were used for phenotyping resveratrol content, and they were planted in a randomized complete block design with three replications in Wuchang in 2015 (F6 generation) and 2016 (F7 generation), Yangluo in 2017 (F8 generation), and Xiangyang in 2018 (F9 generation). These trials were designated as four environments: WC2015, WC2016, YL2017, and XY2018, respectively. Mature pods were harvested, air-dried, and stored at 4 °C until they were shelled by hand before phenotyping.

Phenotyping of Resveratrol Contents by HPLC
The trans-resveratrol standard (CAS 501-36-0) was purchased from Sigma (St Louis, MO, USA) and used to generate the standard curve. The resveratrol content in seeds of the RIL population was quantified according to the reported HPLC method (Wang et al., 2013). Briefly, 10 g of peanut seeds were ground in a coffee blender. Then, 5 ± 0.0001 g of fine powder that passed through a 20 mesh sieve was used for extraction of resveratrol in a 50-ml centrifuge tube. After 40 ml of ethanol (85%) was added, the tube was placed in an 80 °C water bath for 1 h and then centrifuged at 10,000 rpm for 10 min. Subsequently, 8 ml of supernatant was filtered through a C18/AL-N SPE column (Agela Technologies). The filtrate was dried under a stream of nitrogen gas and dissolved in 1 ml of methanol. The sample was finally filtered through a 0.22 µm filter and quantified using the Agilent 1290 system with a C18 column (4.6 mm × 150 mm, 5 µm; Agilent Technologies, USA). The column temperature was set at 30 °C. The mobile phase consisted of (A) ultrapure water with 0.05% acetic acid and (B) HPLC-grade acetonitrile. The chromatogram was recorded at 306 nm and the flow rate was 0.9 ml/min. After 10 µl of sample was injected, a 12 min elution was performed with 78% A and 22% B. The column was then washed for 2 min with 60% A and 40% B to prepare for the next injection. The phenotypic distribution of resveratrol content in the RIL population was plotted with the "ggplot2" R package.
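The text does not spell out how an HPLC reading is converted to seed content in µg/kg. The sketch below back-calculates it from the volumes and mass given above. It rests on assumptions that are not in the source: a linear standard curve (the `slope` and `intercept` parameters are hypothetical), a total extract volume equal to the 40 ml of ethanol added, and complete recovery through the SPE and drying steps.

```python
def concentration_from_peak_area(peak_area, slope, intercept=0.0):
    """Invert an assumed linear standard curve: area = slope * conc + intercept.

    slope and intercept are hypothetical calibration parameters; the source
    only says that a standard curve was generated.
    """
    return (peak_area - intercept) / slope


def resveratrol_ug_per_kg(conc_ug_per_ml,
                          powder_g=5.0,       # powder mass used for extraction
                          extract_ml=40.0,    # ethanol added to the tube
                          aliquot_ml=8.0,     # supernatant passed through SPE
                          redissolve_ml=1.0): # methanol used to redissolve
    """Convert the concentration in the injected sample to seed content."""
    # Resveratrol mass recovered from the 8 ml aliquot, now in 1 ml methanol.
    mass_in_aliquot_ug = conc_ug_per_ml * redissolve_ml
    # Scale from the aliquot to the whole extract (assumes 40 ml total).
    mass_in_extract_ug = mass_in_aliquot_ug * (extract_ml / aliquot_ml)
    # Express per kilogram of seed powder.
    return mass_in_extract_ug / (powder_g / 1000.0)
```

Under these assumptions, a measured concentration of 0.1 µg/ml in the injected sample corresponds to about 100 µg/kg of seed powder, which falls within the ranges reported for peanut.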

SNP Calling and Genotyping
The ddRADseq reads of the RIL population (BioProject: PRJNA520741) (Liu et al., 2020) were re-analyzed following the reference-based pipeline (Rochette and Catchen, 2017) of the Stacks software (Catchen et al., 2013) to identify SNPs. Briefly, low-quality reads were removed by the process_radtags unit of Stacks. Then, high-quality reads from each RIL were mapped to the recently published genome sequence (the KYV3 version) of cultivated peanut cv. Tifrunner (Bertioli et al., 2019) using the BWA software (Li and Durbin, 2009). The gstacks unit of Stacks was used to build loci from uniquely mapped reads. The populations unit was used to remove loci shared by <80% of samples and to output site-level SNP calls in VCF format. These SNPs were converted into genotypes using an in-house Perl script. Individual SNPs were recorded as "A" (homozygous for the Zhonghua 6 allele), "B" (homozygous for the Xuhua 13 allele), or "X" (heterozygous or missing). To make physical positions easy to retrieve, SNPs were named with the initial letters "TIF" (representing Tifrunner), followed by the corresponding chromosome and position. For example, TIF.01:7840610 was the SNP identified at position 7,840,610 bp on chromosome 1 (A01) of the Tifrunner reference genome.
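The in-house Perl script is not shown in the text. The following Python sketch only illustrates the A/B/X coding rule and the marker naming scheme described above. The function names and the input representation (a pair of allele bases per RIL, or `None` for a missing call) are hypothetical, and treating an allele that matches neither parent as "X" is an assumption.

```python
def code_genotype(ril_alleles, zhonghua6_allele, xuhua13_allele):
    """Code one RIL call at one SNP as 'A', 'B', or 'X'.

    ril_alleles: tuple of two bases such as ("G", "G"), or None if missing.
    The parental alleles are the bases for which each parent is homozygous.
    """
    if ril_alleles is None:
        return "X"                      # missing call
    first, second = ril_alleles
    if first != second:
        return "X"                      # heterozygous call
    if first == zhonghua6_allele:
        return "A"                      # homozygous for the Zhonghua 6 allele
    if first == xuhua13_allele:
        return "B"                      # homozygous for the Xuhua 13 allele
    return "X"                          # matches neither parent (assumption)


def marker_name(chrom_number, position):
    """Build a marker name in the TIF.<chromosome>:<position> style."""
    return f"TIF.{chrom_number:02d}:{position}"
```

For instance, `marker_name(1, 7840610)` gives `TIF.01:7840610`, the example name used in the text.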

Construction of High-Density Genetic Map
The identified high-quality SNPs were clustered into bins using a reference-based method, i.e., SNPs that had identical genotypes and were adjacent to each other on the reference genome were clustered into one bin, and the SNP with the fewest missing genotypes was selected to represent the bin. The genetic map was constructed from these bins using the QTL IciMapping software (Meng et al., 2015). Pearson's chi-square test was performed to test the goodness of fit of each locus to the expected 1:1 segregation ratio (P < 0.01).
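The binning rule and the segregation test described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the text does not say how missing calls were handled when genotypes were compared, so the sketch treats two SNPs as identical when they agree at every RIL where both are called, and it compares each SNP with the previous SNP in the current bin. The function names and the `(chrom, pos, genotypes)` tuple layout are hypothetical.

```python
import math


def same_genotypes(g1, g2):
    """True if two genotype strings agree wherever neither call is 'X'."""
    return all(a == b or "X" in (a, b) for a, b in zip(g1, g2))


def bin_snps(snps):
    """Cluster adjacent SNPs with identical genotypes into bins.

    snps: list of (chrom, pos, genotypes) sorted by chromosome and position,
    where genotypes is a string of 'A'/'B'/'X' calls, one per RIL.
    Returns one representative SNP per bin (the one with the fewest 'X').
    """
    bins = []
    for snp in snps:
        last = bins[-1][-1] if bins else None
        if last and last[0] == snp[0] and same_genotypes(last[2], snp[2]):
            bins[-1].append(snp)        # extends the current bin
        else:
            bins.append([snp])          # starts a new bin
    return [min(b, key=lambda s: s[2].count("X")) for b in bins]


def segregation_p_value(genotypes):
    """Pearson chi-square test (1 df) of A:B counts against a 1:1 ratio."""
    n_a, n_b = genotypes.count("A"), genotypes.count("B")
    total = n_a + n_b
    if total == 0:
        raise ValueError("no called genotypes")
    expected = total / 2
    chi2 = ((n_a - expected) ** 2 + (n_b - expected) ** 2) / expected
    # For 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2))
```

A locus would then be flagged as distorted when `segregation_p_value(...)` falls below 0.01, the threshold given in the text.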

Identification of QTL for Resveratrol Content
Additive QTL were identified for each environment by the composite interval mapping (CIM) method of the WinQTLCart software (Wang et al., 2012) with default parameters. The BLUP values for resveratrol content across the four environments were calculated using the "lme4" R package and were also used to identify QTL. The identified QTL were named with the initial letter "q," followed by the trait name (RES for resveratrol content) and the corresponding chromosome. A number was added if two or more QTL were identified on the same chromosome. For example, qRESB07.1 and qRESB07.2 were the first and second QTL identified on chromosome B07, respectively. The putative genes within the 2 LOD confidence interval of each identified QTL were retrieved from the genome annotation of the peanut accession Tifrunner (Bertioli et al., 2019). The annotations of the identified genes were obtained from the Tifrunner annotation file. Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis of the putative genes was conducted using the Blast2GO suite (Gotz et al., 2008).
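To make the naming convention and the idea of a 2 LOD confidence interval concrete, a minimal sketch follows. It is a generic LOD-drop calculation on a scanned grid of positions without interpolation, and it is not a description of how WinQTLCart computes its intervals. The function names and the example LOD profile in the usage note are hypothetical.

```python
def qtl_name(trait, chromosome, index=None):
    """Name a QTL as q<trait><chromosome>, with .<index> when needed."""
    base = f"q{trait}{chromosome}"
    return base if index is None else f"{base}.{index}"


def lod_support_interval(positions_cm, lod_scores, drop=2.0):
    """Return (left, peak, right) positions of a LOD-drop support interval.

    The interval extends outward from the LOD peak while the neighbouring
    scan positions keep a LOD score of at least (peak LOD - drop).
    """
    peak = max(range(len(lod_scores)), key=lod_scores.__getitem__)
    threshold = lod_scores[peak] - drop
    left = peak
    while left > 0 and lod_scores[left - 1] >= threshold:
        left -= 1
    right = peak
    while right < len(lod_scores) - 1 and lod_scores[right + 1] >= threshold:
        right += 1
    return positions_cm[left], positions_cm[peak], positions_cm[right]
```

With a made-up profile of LOD scores `[0.5, 1.5, 3.0, 4.2, 2.5, 1.0]` at positions 0 to 5 cM, the peak is at 3 cM and the 2 LOD interval spans 2 to 4 cM; `qtl_name("RES", "B07", 1)` gives `qRESB07.1`.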

Phenotypic Variation of Resveratrol Contents in the RIL Population
The resveratrol content of the variety Xuhua 13 (the female parent) was consistently lower than that of the variety Zhonghua 6 (the male parent) across the four environments, WC2015, WC2016, YL2017 and XY2018 (Figure 1, Table 1). The resveratrol content of the 186 RILs varied from 37.33 to 270.00 µg/kg in WC2015, from 3.61 to 258.79 µg/kg in WC2016, from 8.96 to 282.89 µg/kg in YL2017, and from 13.60 to 215.06 µg/kg in XY2018 (Table 1). As shown in Figure 1 and Table 1, the distribution of resveratrol content in the RIL population was continuous and showed transgressive segregation. According to the analysis of variance across the four environments, genotypes, environments and genotype × environment interactions significantly influenced resveratrol content among RILs (Supplementary Table 1, P < 0.001). The broad-sense heritability of resveratrol content in the present study was estimated to be 0.33, indicating that environments had a strong influence on the resveratrol content of peanut seeds.
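The text does not state which estimator of broad-sense heritability was used. One commonly used entry-mean form for multi-environment trials is H² = σ²g / (σ²g + σ²ge/e + σ²ε/(e·r)), where e is the number of environments and r the number of replications. The sketch below assumes that form; the defaults of four environments and three replications come from the trial design described earlier, while the variance components in the usage note are made-up numbers, not the study's estimates.

```python
def broad_sense_heritability(var_g, var_ge, var_error, n_env=4, n_rep=3):
    """Entry-mean broad-sense heritability from variance components.

    var_g:     genotypic variance
    var_ge:    genotype x environment interaction variance
    var_error: residual variance
    Assumes H2 = var_g / (var_g + var_ge / e + var_error / (e * r)).
    """
    phenotypic = var_g + var_ge / n_env + var_error / (n_env * n_rep)
    return var_g / phenotypic
```

For example, hypothetical components of 2, 4 and 12 give 2 / (2 + 1 + 1) = 0.5. A value as low as the reported 0.33 would imply, under this formula, that the interaction and residual terms together outweigh the genotypic variance.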

Genotyping of the RIL Population
The 359.96 Gb ddRADseq clean data (3,712.28 M reads) of the parents and RIL population were analyzed using the recently reported genome sequences of Tifrunner as reference. A total of 2,616,605 loci were built following the pipeline of reference-based analysis of the stacks software (Figure 2

Construction of High-Density Genetic Map
The above 2,191 bins were used to construct the genetic map, and the final genetic map consisted of 2,183 genetic loci ( (41), followed by A05 (29)

Evaluation of Genome Synteny of the High-Density Genetic Map
A total of 2,099 SNP loci (96.15%) of the genetic map were located on the same chromosomes as in the reference genome of Tifrunner (Bertioli et al., 2019), and good collinearity was clearly observed for almost all chromosomes (Figure 4, Supplementary Figure 2). BLASTn analysis of the RAD-tags of the other 84 SNP loci against the genome sequences of the cultivated peanuts Tifrunner, Shitouqi (Zhuang et al., 2019), and Fuhuasheng (Chen et al., 2019a) as well as the wild species (Bertioli et al., 2016) showed that translocations have occurred in different peanut varieties (Supplementary Table 3). Reciprocal translocations occurred between chromosomes A03 and B03 as well as A06 and B06 in Tifrunner but not in Shitouqi, Fuhuasheng or the parents of the RIL population. Interestingly, a fragment from B10 was translocated to B03 in Tifrunner and Shitouqi but not in Fuhuasheng.

QTL Associated With Resveratrol Content
Based on the high-density genetic map, three, two, and two QTL for resveratrol content were identified using the phenotyping data of the WC2015, WC2016, and XY2018 environments, respectively, while no QTL was identified in the YL2017 environment. In addition, three QTL were identified using the BLUP values across the four environments (Table 3). The QTL qRESB04 was repeatedly identified, both with the phenotyping data of the WC2016 environment and with the BLUP values across the four environments. Therefore, a total of nine QTL were identified across the four environments (Table 3), and they were located on eight chromosomes, namely A01, A07, A08, B04, B05, B06, B07, and B10 (Figure 5), with 5.07-8.19% of phenotypic variation explained (PVE). The confidence intervals of the identified QTL varied from 2.0 cM for qRESB05 to 30.9 cM for qRESA07 and averaged 10.64 cM. The physical intervals of the identified QTL in the Tifrunner reference genome were conveniently estimated from the names of the flanking markers. The QTL qRESB05 and qRESB10 were located in regions of lower recombination on chromosomes B05 and B10, and their physical intervals (∼18.53 Mb and ∼22.32 Mb, respectively) were much larger than the others (0.69-6.68 Mb). A total of 2,429 putative genes were retrieved within the physical intervals of these QTL from the genome annotation of Tifrunner, ranging from 43 genes for qRESB06 to 544 genes for qRESA07 (Table 3, Supplementary Table 4). As shown in Supplementary Table 5, GO terms were retrieved for 1,410 genes from the Tifrunner genome annotation file. In terms of cellular components, cell (14.8%) was the most frequent GO term, followed by membrane (13.6%), organelle (9.6%), and protein-containing complex (6.7%) (Supplementary Figure 3). For molecular functions, binding (63.2%) was the most frequent GO term, followed by catalytic activity (47.4%) (Supplementary Figure 3).
For biological processes, metabolic process (44.4%) was the most frequent GO term, followed by cellular process (35.0%), biological regulation (10.4%), and localization (9.5%) (Supplementary Figure 3). Based on the KEGG analysis, 94 enzymes that might be encoded by 161 genes were assigned to 81 KEGG pathways of various primary and secondary metabolisms (Supplementary Table 6).
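Because each marker name encodes its reference chromosome and position, the physical span of a QTL can be read off from its flanking markers, as noted above. The sketch below illustrates this under the TIF.<chromosome>:<position> naming scheme; the function names are hypothetical, and the second marker in the usage note is a made-up name, not one from the map.

```python
def parse_marker(name):
    """Split a marker name such as 'TIF.01:7840610' into (chromosome, position)."""
    prefix, rest = name.split(".", 1)
    if prefix != "TIF":
        raise ValueError(f"unexpected marker prefix in {name!r}")
    chrom, pos = rest.split(":")
    return int(chrom), int(pos)


def physical_interval_mb(left_marker, right_marker):
    """Physical distance in Mb between two flanking markers."""
    chrom_left, pos_left = parse_marker(left_marker)
    chrom_right, pos_right = parse_marker(right_marker)
    if chrom_left != chrom_right:
        # Flanking markers on different reference chromosomes cannot give a
        # simple span, for example near a translocation breakpoint.
        raise ValueError("flanking markers lie on different chromosomes")
    return abs(pos_right - pos_left) / 1e6
```

For example, `physical_interval_mb("TIF.01:7840610", "TIF.01:8840610")` returns 1.0, i.e., a 1 Mb interval on chromosome 1.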

DISCUSSION
Peanut is one of the few edible plants that provide dietary resveratrol for humans. Sanders et al. (2000) reported that the resveratrol contents of 15 peanut varieties were ∼196.47 µg/kg. Wang et al. (2013) analyzed the seeds of 102 accessions from the U.S. peanut mini-core collection and found that the resveratrol content varied from 30 to 260 µg/kg and averaged 100 µg/kg. In the present study, the resveratrol content in seeds of the RIL population showed a similar range of variation (i.e., 3.61-282.69 µg/kg). In addition, significant variation was observed across environments, including locations and years (Table 1 and Supplementary Table 1), which is consistent with reports that varieties, geographical indications and vintage years significantly influenced the resveratrol content of red wines (Nikfardjam et al., 2006; Pastor et al., 2019). Moreover, the broad-sense heritability of resveratrol content across environments was relatively low (0.33). Therefore, attention should be paid not only to varieties but also to production locations and years in the production of high-resveratrol peanut, and breeders should tailor their breeding programs to the target production locations.
The availability of multiple peanut reference genomes facilitated the construction and evaluation of the high-density genetic map as well as the discovery of chromosome variations. Previous studies constructed high-density genetic maps through the de novo approach of the Stacks software (Zhou et al., 2014; Wang et al., 2018; Liu et al., 2020). However, most linkage map construction software cannot handle such large numbers of SNPs. The SNPs were therefore reduced randomly or with very stringent filters (Zhou et al., 2014; Wang et al., 2018; Liu et al., 2020), which might cause the loss of useful SNPs.
In the present study, using the recently reported cultivated peanut genome assembly (Bertioli et al., 2019) as reference, the previously reported ddRADseq reads (Liu et al., 2020) were re-analyzed through the reference-based pipeline of Stacks. The identified SNPs were binned according to not only their genotypes in the RIL population but also their physical positions on the reference genome, which effectively removed redundant SNPs and helped in the construction of the high-density genetic map (Figures 2, 3). The constructed genetic map was estimated to cover 96.62% of the reference genome, and its number of loci and density reached a fairly high level compared with previous studies (Ravi et al., 2011; Qin et al., 2012; Huang et al., 2015; Chen et al., 2016, 2017, 2019b; Wang et al., 2019a). Moreover, good collinearity between the constructed genetic map and the physical map was revealed by comparison with multiple peanut reference genomes (Figure 4 and Supplementary Table 3). Notably, translocations between chromosomes, such as A03 and B03, A06 and B06, and B10 and B03, were observed in the present study but were overlooked in the previous study using the de novo approach (Liu et al., 2020).
Resveratrol content is a complex trait controlled by multiple QTL from both subgenomes of peanut. In the present study, a total of nine additive QTL (5.07-8.19% PVE) for resveratrol content were identified on three chromosomes (A01, A07, and A08) of the A subgenome and five chromosomes (B04, B05, B06, B07, and B10) of the B subgenome (Table 3, Figure 5). The top five RILs had higher resveratrol contents than the parents, and they possessed the favorable alleles of 4-7 identified QTL from both parents (Supplementary Table 7). Only one QTL, qRESB04, was identified both in the WC2016 environment and with the BLUP values across the four environments, while the other eight QTL were each identified only once. In addition, no QTL was identified in the YL2017 environment, in which the majority of RILs were clearly skewed toward low resveratrol content (Figure 1). These observations are consistent with the fact that environments and QTL × environment interactions had significant influences on peanut resveratrol content. A pool of 2,429 putative genes was retrieved from the corresponding physical intervals of these QTL in the Tifrunner genome (Bertioli et al., 2019), and many of them were involved in diverse primary and secondary metabolic pathways. However, target genes influencing resveratrol content could not be predicted in the present study. Further studies, e.g., transcriptome/metabolome analysis, might provide more evidence to identify key metabolites and candidate genes associated with resveratrol synthesis in peanut. Moreover, the peanut resveratrol synthase (STS) (Shomura et al., 2005) might be located at ∼13 Mb on chromosome B04 or ∼11 Mb on A04, both of which lie outside the identified QTL. Therefore, resveratrol, as a secondary metabolite, might be synthesized by STS under the influence of various upstream biological processes, and it would be a challenge to breed high-resveratrol peanut varieties.
In conclusion, the present study constructed a high-quality genetic map and identified nine QTL with main effects for resveratrol content using a RIL population across four environments. These QTL lay a foundation for future marker-assisted selection (MAS) in the improvement of resveratrol content. In future studies, a wide-ranging screening of germplasm in multiple production environments should be conducted to determine the range of resveratrol content in peanut seeds and to identify elite accessions for genetic studies and breeding. In addition, more studies could be conducted to evaluate resveratrol contents, their variation, and their relationships in different peanut tissues, which might help in developing diverse high-resveratrol peanut products. As the cost of high-throughput sequencing and genotyping falls, genome-based trait prediction models (Pandey et al., 2020) for resveratrol content could be built and might be used in genomic selection (Crossa et al., 2017) to breed high-resveratrol varieties.
FIGURE 5 | The genetic map locations of QTL identified for resveratrol contents. QTL and their 1 LOD confidence intervals (filled boxes) and 2 LOD confidence intervals (lines) were colored according to environments as shown in the figure legend. The purple color was used to highlight the 2 LOD confidence intervals on chromosome bars. Bins within the 2 LOD confidence intervals were highlighted in purple as well.
Frontiers in Plant Science | www.frontiersin.org

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author/s.

AUTHOR CONTRIBUTIONS
HL, YL, BL, and HJ conceived, designed, and supervised the experiments. XR, LY, DH, YL, BL, and HJ developed the RIL population. JG, BY, WC, HZ, XZ, YC, NL, and LH conducted field trials and phenotyping of resveratrol content by HPLC. HL constructed the high-density genetic map from the sequencing data and interpreted the results. HL prepared the first draft, and YL, BL, and HJ contributed to the final editing of the manuscript. All authors read and approved the final manuscript.

FUNDING
This project was supported by the National Natural Science Foundations of China (Grant Numbers 31971903, 31601340, and 31761143005) and the Central Public-interest Scientific Institution Basal Research Fund (Grant Number 1610172019008). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.