Genotyping-by-Sequencing Based Genetic Mapping Identified Major and Consistent Genomic Regions for Productivity and Quality Traits in Peanut

With an objective of identifying the genomic regions for productivity and quality traits in peanut, a recombinant inbred line (RIL) population developed from an elite variety, TMV 2 and its ethyl methane sulfonate (EMS)-derived mutant was phenotyped over six seasons and genotyped with genotyping-by-sequencing (GBS), Arachis hypogaea transposable element (AhTE) and simple sequence repeats (SSR) markers. The genetic map with 700 markers spanning 2,438.1 cM was employed for quantitative trait loci (QTL) analysis which identified a total of 47 main-effect QTLs for the productivity and oil quality traits with the phenotypic variance explained (PVE) of 10–52% over the seasons. A common QTL region (46.7–50.1 cM) on Ah02 was identified for the multiple traits, such as a number of pods per plant (NPPP), pod weight per plant (PWPP), shelling percentage (SP), and test weight (TW). Similarly, a QTL (7.1–18.0 cM) on Ah16 was identified for both SP and protein content (PC). Epistatic QTL (epiQTL) analysis revealed intra- and inter-chromosomal interactions for the main-effect QTLs and other genomic regions governing these productivity traits. The markers identified by a single marker analysis (SMA) mapped to the QTL regions for most of the traits. Among the five potential candidate genes identified for PC, SP and oil quality, two genes (Arahy.7A57YA and Arahy.CH9B83) were affected by AhMITE1 transposition, and three genes (Arahy.J5SZ1I, Arahy.MZJT69, and Arahy.X7PJ8H) involved functional single nucleotide polymorphisms (SNPs). With major and consistent effects, the genomic regions, candidate genes, and the associated markers identified in this study would provide an opportunity for gene cloning and genomics-assisted breeding for increasing the productivity and enhancing the quality of peanut.


INTRODUCTION
Peanut or groundnut (Arachis hypogaea L. 2n = 4x = 40) is an important oilseed, legume food, and fodder crop, which, in 2019, was cultivated globally on an area of 29.5 million ha with a production of 48.7 million tons and a productivity of 1,647 kg/ha (http://www.fao.org/faostat/en/#data/QC/visualize). Globally, over half of the peanut produce goes for oil extraction while the remaining is consumed as raw and processed food. In India, over 80% of the produce was used for oil extraction in the past. But, now, it has reduced to <50% (Sharma, 2017), indicating a shift in the use of peanut in multiple food preparations. Apart from being rich in oil, proteins, fibers, polyphenols, antioxidants, vitamins, and minerals, the peanut is an excellent source of compounds, such as resveratrol, phenolic acids, flavonoids, and phytosterols, co-enzyme Q10, and amino acids (all 20, with the highest content of arginine). Peanut forms a major food component in fighting malnutrition in the form of Ready-touse Therapeutic Food (RUTF) in Africa and Asia. With these nutrient profiles, peanut is being considered a functional food (Arya et al., 2016). Thus, peanut has gained the status of "poor person's almond" over the years. However, kernel features and nutritional qualities need to be considered while attempting to increase peanut productivity along with tolerance to the biotic and abiotic stresses.
The last decade has been transformational for peanut stakeholders globally because of tremendous developments in the availability of substantial genomic resources and optimization of multiple modern breeding approaches, such as marker-assisted selection (MAS), genomic selection, and rapid generation advancements (as shown in Pandey et al., 2020). The availability of high-quality reference genomes for diploid subgenomes (Bertioli et al., 2016;Chen et al., 2016;Lu et al., 2018), primitive tetraploid (Yin et al., 2018) as well as the subspecies of the cultivated tetraploid peanut (Bertioli et al., 2019;Chen et al., 2019;Zhuang et al., 2019), high density genotyping assay with 58K single nucleotide polymorphisms (SNPs) , genotyping-by-sequencing (GBS) (Dodia et al., 2019;Wang et al., 2019Wang et al., , 2021Zhou et al., 2021), and other reducedrepresentation sequencing (Zhao et al., 2016;Shirasawa et al., 2018;Luo et al., 2021) based genotyping in peanut provided a strong platform for precise trait mapping, gene discovery, and marker development for use in breeding (Han et al., 2018;Wang et al., 2019). With the availability of trait-specific markers, peanut has already demonstrated the application of marker-assisted breeding by developing several new varieties with improved disease resistance and oil quality (as shown in Pandey et al., 2020). However, the challenge still prevails for molecular breeding to improve the productivity traits that show complex genetic inheritance. Therefore, such traits need multi-environment phenotyping and dense genotyping data for performing high-resolution genetic mapping and the precise detection of genetic factors with direct and epistatic effects over the seasons.
The recombinant inbred line (RIL) population (Pattanashetti, 2005) derived from an elite peanut variety TMV 2 and its ethyl methane sulfonate (EMS)-induced mutant TMV 2-NLM (Prasad et al., 1984) allowed subtracting a large portion of the genome common between the parents, thereby favoring successful trait mapping as demonstrated with 105 AhTE markers in our previous effort (Hake et al., 2017). Therefore, this study aimed to enrich the linkage map with GBSbased SNP markers along with AhTE and simple sequence repeat (SSR) markers and generating the phenotypic data over six seasons to detect the genomic regions with main and epistatic effects in addition to identifying a few cosegregating genes.

Plant Material
We used a RIL population developed (Pattanashetti, 2005) from the cross between TMV 2, an elite variety of peanut and its EMS-mutagenized derivative TMV 2-NLM (Prasad et al., 1984). TMV 2 is a Spanish bunch cultivar known for its uniform pods and kernels, kernel taste, and wide adaptability (Rathnakumar et al., 2013), but low in OLE (42.08%). TMV 2-NLM is a semi-spreading cultivar with bold kernels, low yield, and moderate content of oleic acid (53.73%) (Prasad et al., 1984). The phenotypic and genotypic differences between TMV 2 and TMV 2-NLM have been previously reported by Hake et al. (2017).  (Figure 1). During each season, the RILs were grown in two replications with a spacing of 30 × 10 cm with recommended agronomic practices. The observations were recorded on productivity traits, such as the number of pods per plant (NPPP), pod weight per plant (PWPP), shelling percentage (SP), and test weight (TW), and on quality traits, such as protein content (PC), oil content (OIL), OLE, linoleic acid content (LIN), and oleic to linoleic acid ratio (O/L). The quality parameters were estimated using the near-infrared reflectance spectroscopy (NIRS) (Model XDS RCA, FOSS Analytical AB, Sweden, Denmark) at ICRISAT, Patancheru, India.

Phenotyping of the Mapping Population and Statistical Analysis
An ANOVA was performed for each trait observed during each season to test the significant differences among the RILs. A pooled analysis of variance was performed for all the traits across the seasons allowing G × E interactions. Phenotypic coefficient of variation (PCV), genotypic coefficient of variation (GCV), and broad sense heritability (h 2 b.s) were estimated using the plant breeding package Windostat ver. 8.5 (Indostat Services, Hyderabad, India, https://www.indostat.org/ agriculture.html). Pearson's correlation coefficients (r) among the different traits were estimated over the seasons using the 16th version of SPSS (SPSS Inc., Chicago, IL, USA). FIGURE 1 | Flow chart of genotyping, high density genetic map construction, multi-season phenotyping, identification of genomic regions for productivity, and quality traits and their validation.

DNA Extraction and Genotyping With AhTE and SSR Markers
DNA was isolated from the young leaves of each RIL and the parents following the modified cetyltrimethyl-ammonium bromide (CTAB) method as described by Cuc et al. (2008). DNA quality of each sample was checked on 0.8% agarose gel. Furthermore, DNA quantification was done using Nano-Drop (UV technologies, Wilmington, DE, USA), and DNA concentration was normalized to ∼5-10 ng/µl for genotyping the parents and the RIL population using AhTE and SSR markers. In total, 343 AhTE markers (Gayathri et al., 2018) and 91 SSR markers (as shown in Pandey et al., 2012) were screened for parental polymorphism between TMV 2 and TMV 2-NLM. Subsequently, the markers polymorphic between the parents were identified and used to genotype the RILs (Figure 1). PCR and separation of the amplicons and scoring of the alleles were performed as described by Kolekar et al. (2016). Genotypic data on 105 AhTE markers generated by Hake et al. (2017) on these RILs were also employed for genetic mapping.

GBS of the RILs, Sequence Analysis and SNP Calling
Genotyping-by-sequencing was performed for the RILs and their parents as described by Dodia et al. (2019). To perform GBS, 10 ng DNA from each RIL was digested using the restriction endonuclease enzyme ApeKI that recognizes the site G/CWCG. The ligation enzyme, T4 ligase, was used to ligate the digested products with uniquely barcoded adapters. Such digestion and ligation were performed for each RIL, and an equal proportion of the products from each sample was mixed to construct the libraries. These libraries were amplified and purified to remove the excess adapters. They were sequenced on HiSeq 2500 platform (Illumina Inc., San Diego, CA, USA) to generate genome-wide sequence reads.
The sequence reads for the parents and the RILs were obtained as FASTQ files, which were used for SNP discovery using TASSEL version 5.2 (Bradbury et al., 2007) (Supplementary Figure 1). Initially, the perfectly matched barcodes were detected with four bases remnants of the digestion site of the restriction enzyme in the sequencing reads generated for RILs and parental genotypes. Reads were sorted and de-multiplexed using the barcodes. They were trimmed for the first 64 bases starting from the recognition site of the restriction enzyme. Reads containing "N" within the first 64 bases were identified and discarded. Reads passing the quality filtering criteria were mapped onto the reference genome of cultivated peanut A. hypogaea (Bertioli et al., 2019) using the Burrows-Wheeler Alignment (BWA) tool (Li and Durbin, 2009). The mapped reads were exported in the form of Sequence Alignment Map (SAM) file. Furthermore, the alignment file was processed for SNP calling using SNP caller plugin implemented in TASSEL version 5.2.0 GBS v2 pipeline as per the standard instruction (https://bitbucket.org/tasseladmin/tassel-5-source/wiki/Tassel5GBSv2Pipeline). The RILs with <85 Mb data were not processed for further analysis to avoid falsepositives. The SNPs with more than 50% missing data and minor allele frequency (MAF) of ≤0.3 were filtered out to avoid the noise during genetic map construction. The SNPs with <50% missing data for the RILs were imputed using Beagle version 5.2 (Browning et al., 2018) algorithm. Furthermore, filtering was performed to check the percentage heterozygosity and polymorphic SNPs between the parents (Supplementary Figure 1).

Genetic Map Construction
High-quality SNPs obtained after filtering were further considered for genetic analysis. A chi-square (χ 2 ) test was applied on polymorphic AhTE, SSR, and SNP markers with a null hypothesis that two alleles from both parents of RIL population at a particular locus segregate in a 1:1 ratio. The markers showing high segregation distortion (χ 2 -test, P < 0.001) were filtered out and not considered for the linkage map construction. The genetic map was constructed using JoinMap (version 4.0) (Van Ooijen, 2006) with logarithm of odds (LOD) threshold ranging from 3.0 to 10.0 and a minimum recombination threshold of 45%. Grouping and ordering of the markers were performed using the regression mapping algorithm. Kosambi map function (Kosambi, 1943) was used for genetic map construction, and to convert the recombination frequencies into map distances in centiMorgans (cM). Chromosome-wise marker positions with their respective names were used to draw the final genetic map using MapChart (Voorrips, 2002). The mapped markers were also analyzed for their genic and non-genic location and functional annotation (especially for the SNP and AhTE markers).

Main-Effect and Epistatic Quantitative Trait Locus Analysis
The main-effect quantitative trait locus (QTL) analysis was carried out using a "composite interval mapping (CIM)" approach (Zeng, 1994) with Model 6 and scanning distance of 1.0 cM between markers and moving window size of 10.0 cM using Windows QTL Cartographer version 2.5 (Wang et al., 2007). A forward-backward stepwise regression method was used to set the marker cofactors for the background selection. The highest peak was considered to locate QTL where the distance between the peak and the QTL was <5.0 cM. Permutation (1,000) test was performed to work out the threshold and identify the significant QTL. The QTLs with >3.0 LOD and phenotypic variance explained (PVE) >10% were considered as major effect QTLs for a particular trait. Those with PVE <10% were considered as minor effect QTLs. Based on the trait name and chromosome number, the QTLs were named, where the first letter "q" indicated the QTL and the abbreviated capital letters indicated the trait followed by chromosome number and the numerical number indicating the serial number of the QTL for a trait. For instance, qPC-Ah16-1 was the first QTL for PC detected on chromosome Ah16.
Analysis for the epistatic QTL (epiQTL) (Q × Q) was conducted using the function "two-dimensional scanning ICIM-EPI" implemented in inclusive composite interval mapping (ICIM) software version 4.1  with 5 cM step and 0.001 probability mapping parameters in stepwise regression.
The minimum threshold LOD value for significant epiQTL was set at 3.0.

Single Marker Analysis
Association of the markers with the productivity and quality traits was tested by SMA using the lm() function (linear regression) of the R program.

Putative Gene Discovery From Major QTLs/SMA
Putative genes were identified for the major QTLs or QTL clusters. The region between the flanking markers of a particular QTL on the physical map was considered for candidate gene discovery. Where the physical distance between the two flanking markers was more, the marker closer to the peak was selected, and a physical distance of 5 Mb (toward the QTL peak) was searched for the candidate genes. Also, those markers which were identified to be significantly associated with the traits by SMA were checked for their location (genic and non-genic) and effect at PeanutBase (www.peanutbase.com).

Confirmation of QTLs and Markers
A few selected major QTLs identified in this study were validated using other genotypes (GPBD 4, TG 26, TAG 24, ICGV 86699, ICGV 86855, ICGV 06189, DBG 3, and DBG 4). The markers flanking these QTL were used for genotyping as described above. Co-segregation between the marker and the phenotype was checked using the t-test.

GBS Based High-Density Genetic Map
A total of 1,067.58 million raw reads (100.87 GB) were obtained for the 403 RILs and the two parents. On average, 2.42 million reads (0.23 GB) data were generated for each sample. After filtering, a total of 978.96 million reads (82.9 GB) were mapped onto the tetraploid reference genome of cultivated tetraploid peanut "Tifrunner" (Bertioli et al., 2019). In total, 47,584 raw SNPs (mean read depth of 73.7) were extracted for the downstream analysis. Out of these SNPs, 1,205 polymorphic SNPs were identified between the parental genotypes (TMV 2 and TMV 2-NLM). Further filtering based on missing data and segregation distortion identified 713 SNPs high-quality polymorphic SNPs. The missing data for these SNPs ranging from 0.01 (1%) to 0.489 (48.9%) (Supplementary Table 4).
Overall genotypic data available for mapping included 865 markers; comprising of 713 SNPs, 143 AhTEs (105 from Hake et al., 2017 and 38 from this study), and 9 SSRs (Supplementary Table 5). The polymorphism percentage for the SNP, AhTE, and SSR markers were 1.49% (713/47,584), 20.08% (143/712), and 9.89% (9/91), respectively. Of these 865 markers, a total of 700 (including 553 SNPs, 136 AhTEs, and 8 SSRs) were mapped to construct a new genetic map spanning a map distance of 2,438.2 cM for this mapping population (Figure 2A and Supplementary Table 6). The genetic map with 20 linkage groups showed a marker density of 3.48 cM/locus. The number of mapped loci ranged from 20 (Ah07 with a density of 2.65 cM/locus) to 66 (Ah02 with a density of 1.60 cM/locus).
The length of the chromosomes ranged from 53.06 cM (Ah07) to 180.52 cM (Ah13) (Figure 2B and Supplementary Table 5). Overall, the genetic map showed good marker collinearity with a physical map having a few exceptions ( Figure 2C).
three were major-effect QTLs. Of them, the QTL (qSP-Ah13-1) identified on Ah13 had the highest PVE of 52.8% with the LOD score of 35.5 and stability over four seasons (S2, S3, S4, and S5). The second major effect QTL (qSP-Ah16-1) on Ah16 identified over three seasons (S1, S2, and S6) with the highest LOD score of 6.6 showed the highest PVE of 19.9% (Figure 3 and Table 1).
The third major QTL (qSP-Ah02-1) on Ah02 had a PVE of 12.6% with a LOD score of 5.7 was detected only during S3. The remaining seven QTLs identified on Ah01, Ah05, Ah09, Ah10, Ah11, and Ah20 were minor QTLs that appeared only during one or two seasons. The favorable alleles for NPPP and PWPP were contributed by TMV 2-NLM, while TMV 2 contributed the favorable allele at two major QTLs for SP. No major-effect QTL was detected for TW; however, 12 minor-effect QTLs were identified with the highest PVE of 8.5% (qTW-Ah02-4). For OIL, 11 QTLs were identified, of which two were major, and located on chromosome Ah03 (qOIL-Ah03-3) and Ah05 (qOIL-Ah05-1). qOIL-Ah03-3 was detected over four (S1, S2, S4, and S5) seasons with the highest LOD score of 9.5 and PVE of 13.7%. While qOIL-Ah05-1 was detected over the two seasons (S2 and S4) with the highest LOD score of 4.6 and PVE of 10.7%. The favorable alleles at both these QTLs were contributed by TMV 2-NLM (Figure 3 and Table 1).
A total of 47 QTLs were identified for quality traits (PC, OLE, LIN, and O/L) (Figure 3 and Supplementary Table 7). For PC, out of seven QTLs, two QTLs on Ah16 were major and stable across four seasons. Of them, qPC-Ah16-1 had the highest PVE of 13.3% with a LOD score of 15.3, and qPC-Ah16-2 had the highest PVE of 13.3% with a LOD score of 13.5. The favorable alleles for both these QTLs were contributed by TMV 2-NLM (Figure 3 and Table 1). A total of 14 QTLs were identified for OLE along with 13 QTLs each for LIN and O/L. Of them, 12 QTLs for OLE were major and stable with the highest PVE of 21.3%. The remaining two minor QTLs were mapped on Ah10 and Ah16 (Figure 3, Table 1, and Supplementary Table 7). For LIN, all the 13 QTLs were major and stable with the highest PVE of 17.1% (Figure 3, Table 1, and Supplementary Table 7). However, for O/L out of 13 QTLs, 11 were major and stable with the highest PVE of 18.4%. All the major and stable QTLs for OLE, LIN, and O/L clustered on a 26.5 cM region (61.1-87.6 cM) on Ah19 (Figure 3, Table 1, and Supplementary Table 7). It was inferred that TMV 2 contributed to the decreased level of OLE, and increased level of LIN.

Common QTL Clusters for the Productivity and Quality Traits
Three clusters were identified for the productivity and quality traits. Cluster 1 of 3.4 cM (46.7-50.1 cM on Ah02) was common for NPPP, PWPP, and SP. It showed the maximum PVE of 23.6, 20.9, and 12.6% for NPPP, PWPP, and SP, respectively (Figure 3 and Table 1). This region was highly stable for PWPP as it was detected over four seasons (S1, S2, S3, and S4). Also, the additive effects were high for this region, and the favorable alleles for NPPP, PWPP, and SP were contributed by TMV 2-NLM. Cluster 2 of 10.9 cM (7.1-8 cM on Ah16) carried the major QTL for PC and SP with a PVE of 7.9-13.3% and 11.9-19.9%, respectively (Figure 3 and Table 1). Cluster 3 of 26.5 cM (61.1-87.6 cM on Ah19) controlled OLE, LIN, and O/L with the PVE of 5.0-21.3%, 5.5-17.1%, and 6.0-18.4%, respectively, and this region was consistently stable over all the six seasons (Figure 3 and Table 1).

Single Marker Analysis
Single marker analysis revealed that a total of six markers were significantly associated with OLE along with five each for LIN and O/L with PVE ≥ 10. Of them, four markers (Ah19_155127364, Ah19_155135344, Ah19_155135353, and Ah19_155172354) and one (Ah19_155165240) marker located, respectively, on Ah19 and Ah09 were common for OLE, LIN, and O/L. These markers were also identified to be the flanking markers by CIM. These associations were consistent over all the six seasons ( Table 2). In addition, Ah10_36971572 located on Ah10 showed association with OLE only during the S4 season.
A total of 10 markers were significantly associated with PC with a PVE of ≥10. Out of which AhTE0242 and AhTE0060 located at 0-7.16 cM were also identified by CIM. Three markers (Ah12_118126407, AhTE0242, and AhTE0060) showed association in S1 and S4 season, while seven markers (AhTE0281, Ah03_127278448, AhTE0087, AhTE0275, AhTE0120, AhTE1110, and AhTE1451) showed association only during the S1 season ( Table 2). There were a few other markers associated with PC; however, they either showed relatively low PVE or appeared only during the specific seasons. SMA revealed that four markers (AhTE0281, AhTE0087, AhTE0120, and AhTE0242) were significantly associated with SP during the S3 season ( Table 2). However, none of them was in the main effect QTL region detected for SP. For the remaining traits (OIL, NPPP, PWPP, and TW), none of the markers were detected as significant by SMA.
Of the four epistatic interactions for NPPP involving the major and main effect QTLs, three emerged from a genomic region on Ah02 showing significant interaction with the regions on Ah04, Ah06, and Ah12 with the PVE of 15.8, 17.7, and 20.0%, respectively during S4 season. Also, the major QTL for NPPP on Ah04 showed epistatic interactions with its own proximal region (10 cM) during S2, S3, and S4 seasons with a maximum PVE of 28.5% (Figure 3 and Table 3). Out of the remaining five epiQTLs for NPPP, those on Ah06, Ah18, and Ah19 showed interactions with their own close proximal regions (20, 5, and 15 cM) with the highest PVE of 39.6, 13.6, and 35.8%, respectively. Furthermore, the epiQTLs on Ah03 and Ah10 showed significant interactions with genomic regions on Ah16 and Ah19 with maximum PVE of 13.4 and 11%, respectively for NPPP (Figure 3 and Supplementary Table 8).
Of the four epistatic interactions for SP involving the major and main effect QTLs, a main effect QTL region on Ah16 (7.1-18 cM) showed epistatic interactions with Ah03, Ah12, and Ah13 with the maximum PVE of 24.2, 23.5, and 23.9%, respectively. Furthermore, a region (159.3-178.3 cM) on Ah13 with major main effect QTL also showed epistatic interactions with Ah08 recorded maximum PVE of 23.2% (Figure 3 and Table 3). In addition, a minor main effect QTL region on Ah01 for SP showed significant interaction with consecutive regions (at 60 and 80 cM) on Ah03 with the highest PVE of 22.3 and 22.6%. Among the remaining 47 epiQTLs, regions on Ah01, Ah05, Ah13, and Ah19 for SP were also involved in epistatic interaction with their own close proximal regions with PVE of 39.7,30.4,32.4,and 31.6%, respectively (Figure 3 and Supplementary Table 8).  For PWPP, none of the main-effect QTL was involved in epistatic interactions. However, the new epiQTL regions on Ah04, Ah05, Ah06, and Ah13 showed significant interactions with their own close proximal regions (5 cM) with the highest PVE of 36.4, 29.5, 43.3, and 26.7%, respectively. Apart from these, eight other epiQTLs appeared in at least two seasons with the major effect (PVE ≥10%) (Figure 3 and Supplementary Table 8).
Of the 14 epiQTLs detected for TW, a main effect minor QTL on Ah12 was involved in epistatic interaction with regions on Ah03, Ah07, and Ah14 with the highest PVE of 24.8, 28.5, and 15.8%, respectively. Among the remaining 11 epiQTLs, regions on Ah01, Ah03, Ah05, and Ah11 were also involved in epistatic interactions with maximum PVE of 17.6, 24.8, 28.4, and 14.3%, respectively. Most of these genomic regions identified in this study were important since they carried stable major QTL(s) which also showed significant epistatic interaction for various traits with PVE ≥ 10% across the seasons (Figure 3 and Supplementary Table 8).

Putative Genes Identified in Major Main-Effect QTL Regions/Clusters
In total, the three clusters and four major and stable QTL regions were subjected to candidate gene discovery. In cluster 1, a 5 Mb region from the left flanking marker (Ah02_100281747) toward the common QTL peak for NPPP, PWPP, and SP was considered for gene discovery, and 360 genes were found (Table 4). In clusters 2 and 3, the region between the left and right flanking markers were considered, and 34 and 3 genes were found in the regions, respectively (Table 4).
Of the four major QTL regions, a 5 Mb region from the left flanking marker to the QTL peak was employed for three; qPC-Ah16-1, qOIL-Ah03-3, and qOIL-Ah05-1, and 249, 421, and 333 genes were found in these regions, respectively ( Table 4). The region between the two flanking markers was considered for qSP-Ah13-1, and 259 genes were found (Table 4). However, more studies are required to identify the candidate genes contributing to these traits.
Sixteen markers that were identified to be significantly associated with the traits by single marker analysis were checked for their location (genic and non-genic), effect, and probable function (Supplementary Table 9). Of them, 10 were found to be located in the intergenic regions and two each were located in the exonic, 5 ′ UTR, and intronic regions (Supplementary Table 9). AhTE0281 being located in the 16th exonic region of Arahy.7A57YA on Ah16 contributed for SP and PC (Supplementary Table 9). In addition, Ah12_118126407 being located in the second exon of Arahy.J5SZ1I (Ah12) governed PC. AhTE1451 being located in the 5 ′ UTR of Arahy.CH9B83 (Ah18) also governed PC (Supplementary Table 9). Similarly, an SNP at 155172354 bp being located in the 5 ′ UTR of the gene Arahy.X7PJ8H on Ah19 contributed for OLE, LIN, and O/L (Supplementary Table 9). Also, both Ah19_155135344 and Ah19_155135353 being located in the 11th intron of Arahy.MZJT69 on Ah19 contributed for OLE, LIN, and O/L (Supplementary Table 9).

Confirmation of the QTLs and Markers
The two stable major QTLs qPC-Ah16-1 and qOIL-Ah03-3 were selected for validation using the other eight genotypes (Supplementary Table 10). The closest flanking markers; AhTE0242 for qPC-Ah16-1 and AhTE1144 for qOIL-Ah03-3 were used for genotyping. The t-test was significant (p < 0.05) for AhTE0242 and AhTE1144 markers, indicating a strong validation of the markers and thereby the QTL for PC and OIL ( Table 5).

DISCUSSION
In our previous study, the RIL population derived from TMV 2 and TMV 2-NLM was used for constructing the AhTE markerbased genetic map and identifying the QTL for important taxonomic and productivity traits (Hake et al., 2017). Since the parents and the RILs also differed for quality traits, an effort was made in the present study to map the quality traits using an improved genetic map with extensive multi-season phenotypic data on the productivity and quality traits collected over six seasons and GBS-derived SNP data. In this population, GBS could identify more number of SNPs (713) polymorphic between TMV 2 and TMV 2-NLM than the number of SNPs (31 SNP loci) detected using the ddRAD-Seq in the previous study (Hake et al., 2017). This could be due to the differences in the methodology, especially the use of a four-base cutter and a six-base cutter in ddRAD-Seq, while only a four-base cutter in GBS for generating the DNA fragments. With the 865 markers available for mapping, a total of 700 markers loci were mapped on the genetic map of 2,438.1 cM. The map density was increased to 3.5 cM/loci as compared with a previous genetic map where a total of 91 marker loci were mapped onto a genetic map of 1,205.6 cM with 18.1 cM/loci map density (Hake et al., 2017). In the previous genetic mapping studies with GBS or WGRS (whole genome re-sequencing) or SNP array, the diploid reference genomes of Arachis duranensis and Arachis ipaensis were used for SNP calling (Dodia et al., 2019;Gangurde et al., 2020). However, the present study used the tetraploid peanut (A. hypogaea) genome (Bertioli et al., 2019) as the reference for the true representative SNP calling.
The genomic regions controlling NPPP, PWPP, SP, TW, PC, OIL, OLE, LIN, and the OLE to LIN ratio (O/L) were identified using the phenotypic data generated over six seasons and the newly constructed improved genetic map with SNP, AhTE, and SSR markers. In this study, the main effect QTLs with major contributions (>10% PVE) were detected for all the traits except for TW. Likewise, the genomic region showing epistatic interactions for productivity traits (NPPP, PWPP, SP, and TW) were also identified. It was noticed that the traits identified with the major QTL showed higher GCV and broad sense heritability. The QTLs for highly correlated traits, such as OLE, LIN, and O/L shared a common marker interval on chromosome Ah19. Similarly, QTLs for NPPP, PWPP, and SP shared common marker interval QTLs on Ah02 and for PC and SP on chromosome Ah16. Though G × E interactions were significant for all the traits, stable QTL regions could be  Phenotypic values and allelic pattern of genotypes considered for QTL validation given in Supplementary Table 9.
detected for the majority of the traits in this study. Based on the stability of QTLs across the seasons, we identified the QTL clusters and markers for validation and subsequent deployment in molecular breeding for improving the traits. The QTL region flanked by Ah02_100281747-Ah02_1558084 on chromosome Ah02, either through its main effect or epistatic interactions, showed significant contribution for NPPP (through qNPPP-Ah02-1), PWPP (through qPWPP-Ah02-1), and SP (through qSP-Ah02-1). The QTL regions on Ah06 and Ah19 were also important for NPPP. A QTL on Ah06 was important for PWPP only through its epistatic interaction. QTLs on chromosomes Ah13 and Ah16 for SP showed main as well as epistatic effects, while the same and its consecutive region on Ah16 also showed main QTL for PC. Therefore, selection based on the QTL region at 7.1-18.0 cM on chromosome Ah16 might improve not only SP but also PC. This was also supported by the significant positive correlation between SP and PC that was observed in this study and the previous study (Kumar et al., 2014). Moreover, SMA also showed that four markers contributed to both SP and PC. Furthermore, validation of these markers across the seasons and genotypes might indicate their utility in the markerassisted breeding for simultaneous improvement of PC and SP since seasonal variation for PC has been reported earlier (Sarvamangala et al., 2011). The main effect QTLs on Ah12 for TW also showed epistatic interactions with genomic regions on Ah05 and Ah14. This might help in transferring the main effect and epiQTLs simultaneously to improve kernel weight. The selection based on the main effects of the QTLs on chromosome Ah03 (qOIL-Ah03-3) and Ah05 (qOIL-Ah05-1) could advance the genetic gains for OIL. Similarly, the main effect of the QTL clusters at 61.1-87.6 cM on chromosome Ah19 could contribute to improving O/L (increased OLE and decreased LIN). Single marker analysis also showed the significant association of five markers from this region with OLE, LIN, and O/L stably across the seasons. The QTL regions in the close vicinity on a few chromosomes (Ah01, Ah04, Ah05, Ah06, Ah13, and Ah19) showing epistatic interaction for the productivity traits might be resolved by fine mapping so that the selection becomes more effective. Parent TMV 2-NLM could be considered as the source of favorable allele at the region on Ah02 which contributed for NPPP, PWPP, SP, and TW. In addition, the favorable allele from TMV 2 at 19 cM region on Ah13 might be considered while selecting for SP. Two of the QTL regions identified in this study were validated using other genotypes. A region 7.1-18.0 cM on chromosome Ah16 for PC and 37.4-37.5 cM region on chromosome Ah03 for OIL showed strong validation, indicating that these QTLs are genotype-independent. Many QTLs were also consistent as they were reported to be co-localized in the previous studies thereby supporting their utility. The region at 46.7-50.1 cM on chromosome Ah02 identified for NPPP, PWPP, SP, and TW in this study was previously detected for pod length (Fonceka et al., 2012), seed length , and SP (Chavarro et al., 2020). Similarly, the 37.4-37.5 cM region on chromosome Ah03 linked to OIL was reported by Sarvamangala et al. (2011). The single region on Ah19 linked to OLE, LIN, and O/L was consistent with the study of Pandey et al. (2014) and Shasidhar et al. (2017). However, a stable QTL region (159.3-178.3 cM) on Ah13 reported in this study for SP with the LOD score of 3.2-35.5 and PVE of 16.3-52.8% over four seasons differed from the region (60.3-64.7 cM) reported by Zhang et al. (2019) for seed length.
With the availability of the genome sequence for the diploid ancestors and the cultivated peanut now (Bertioli et al., 2019), candidate gene discovery is relatively easy as it has been reported for SP (Luo et al., 2017), seed weight (Gangurde et al., 2020), TW , stem rot resistance (Dodia et al., 2019), and foliar disease resistance . Here, putative gene discovery was performed in the three QTL clusters and four major QTL regions; 3.4 cM region on Ah02 chromosome identified for NPPP, PWPP, SP, and TW, 10.9 cM region on Ah16 for SP and PC, and 26.5 cM region on Ah19 for OLE, LIN, and O/L. There were 360 predicted genes in the 3.4 cM region on Ah02, while 34 genes were identified in the 10.9 cM region on Ah16. The 26.5 cM QTL cluster on Ah19 had only three predicted genes. Furthermore, this region was in the vicinity of FAD2B gene that determines OLE and LIN content and therefore used widely for marker-assisted breeding (Jadhav et al., 2021).
Putative gene discovery was also performed in the four stable major QTL regions, such as qSP-Ah13-1 at 19 cM on Ah13 for SP, qSP-Ah16-1 at 7.1 cM on Ah16 for PC, and QTLs (qOIL_Ah03-3 at 0.1 cM on Ah03 and qOIL_Ah05-1 at 7.1 cM on Ah05) for OIL. Since the flanking markers for these QTLs were distanced quite apart, the markers close to the peak were selected, and a physical distance of 5 Mb toward the QTL peak was searched for the predicted genes. This effort identified a large number of genes (249-421) across the four QTL regions. Therefore, it may be too primitive to conclude about the candidate genes for the traits observed in this study, and fine mapping might be essential to resolve the regions and identify the candidate genes for effective use in marker-assisted breeding.
Also, the marker loci associated with the traits identified through SMA were further considered for identifying the candidate genes. In total, five candidate genes could be identified; they included Arahy.7A57YA (coding for ARM repeat superfamily protein) for SP and PC, Arahy.J5SZ1I (coding for syntaxin of plants) and Arahy.CH9B83 (coding for phosphatidylinositol 3,4,5-trisphosphate 3-phosphatase and dual-specificity protein phosphatase PTEN-like isoform) for both SP and PC, and Arahy.MZJT69 and Arahy.X7PJ8H for OLE, LIN, and O/L. Arahy.MZJT69 (coding for receptorlike protein kinase 4) and Arahy.X7PJ8H (coding for protein kinase superfamily) altered the phenotype probably through SNP, while Arahy.7A57YA (coding for ARM repeat superfamily protein) and Arahy.CH9B83 (coding for phosphatidylinositol 3,4,5-trisphosphate 3-phosphatase and dual-specificity protein phosphatase PTEN-like isoform) genes contributed to the phenotype probably through the transpositional activity of AhMITE1 as reported earlier with AhTE0391 marker in Aradu.7N61X (coding for alpha-glucosidase) (Hake et al., 2017).
Overall, this study contributed to the development of an improved map with 700 markers for a unique mapping population derived from an elite variety TMV 2 and its mutant, which probably offers a greater opportunity for subtracting a major portion of the genome common to both the parents and considering probably a small portion of the genome that differs between the parents for mapping the traits. This fact was pronounced both in this study as well as the previous study (Hake et al., 2017), which together reported the mapping of taxonomical, productivity, and quality traits in peanut.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found at: BioProject: PRJNA744708.

AUTHOR CONTRIBUTIONS
MJ conducted phenotyping, genotyping with AhTE and SSR markers, analyzed the data, and contributed to manuscript draft preparation. AY and SM assisted in generating the phenotypic and genotypic data. SG analyzed data and contributed to manuscript preparation. AH collected the data on phenotyping in two seasons and generated the AhTE marker data. SP and MG developed the mapping population and contributed to planning the study. KS performed SNP effect analysis. RV contributed for generating GBS data. MP contributed to generating GBS data, guiding genetic analysis, and planning manuscript content. RB conceptualized the study, planned manuscript content, coordinated with co-authors, and finalized the manuscript. All authors contributed to the article and approved the submitted version.

FUNDING
The funding support for genotyping-by-sequencing (GBS) performed in this study was received from National Agricultural Science Fund (NASF) of Indian Council of Agricultural Research, India.