Newly Developed MAGIC Population Allows Identification of Strong Associations and Candidate Genes for Anthocyanin Pigmentation in Eggplant

Multi-parent advanced generation inter-cross (MAGIC) populations facilitate the genetic dissection of complex quantitative traits in plants and are valuable breeding materials. We report the development of the first eggplant MAGIC population (S3 Magic EGGplant InCanum, S3MEGGIC; 8-way), constituted by the 420 S3 individuals developed from the intercrossing of seven cultivated eggplant (Solanum melongena) and one wild relative (S. incanum) parents. The S3MEGGIC recombinant population was genotyped with the eggplant 5k probes SPET platform and phenotyped for anthocyanin presence in vegetative plant tissues (PA) and fruit epidermis (FA), and for the light-insensitive anthocyanic pigmentation under the calyx (PUC). The 7,724 filtered high-confidence single-nucleotide polymorphisms (SNPs) confirmed a low residual heterozygosity (6.87%), a lack of genetic structure in the S3MEGGIC population, and no differentiation among subpopulations carrying a cultivated or wild cytoplasm. Inference of haplotype blocks of the nuclear genome revealed an unbalanced representation of the founder genomes, suggesting a cryptic selection in favour or against specific parental genomes. Genome-wide association study (GWAS) analysis for PA, FA, and PUC detected strong associations with two myeloblastosis (MYB) genes similar to MYB113 involved in the anthocyanin biosynthesis pathway, and with a COP1 gene which encodes for a photo-regulatory protein and may be responsible for the PUC trait. Evidence was found of a duplication of an ancestral MYB113 gene with a translocation from chromosome 10 to chromosome 1 compared with the tomato genome. Parental genotypes for the three genes were in agreement with the identification of the candidate genes performed in the S3MEGGIC population. Our new eggplant MAGIC population is the largest recombinant population in eggplant and is a powerful tool for eggplant genetics and breeding studies.

Multi-parent advanced generation inter-cross (MAGIC) populations facilitate the genetic dissection of complex quantitative traits in plants and are valuable breeding materials. We report the development of the first eggplant MAGIC population (S3 Magic EGGplant InCanum, S3MEGGIC; 8-way), constituted by the 420 S3 individuals developed from the intercrossing of seven cultivated eggplant (Solanum melongena) and one wild relative (S. incanum) parents. The S3MEGGIC recombinant population was genotyped with the eggplant 5k probes SPET platform and phenotyped for anthocyanin presence in vegetative plant tissues (PA) and fruit epidermis (FA), and for the light-insensitive anthocyanic pigmentation under the calyx (PUC). The 7,724 filtered high-confidence single-nucleotide polymorphisms (SNPs) confirmed a low residual heterozygosity (6.87%), a lack of genetic structure in the S3MEGGIC population, and no differentiation among subpopulations carrying a cultivated or wild cytoplasm. Inference of haplotype blocks of the nuclear genome revealed an unbalanced representation of the founder genomes, suggesting a cryptic selection in favour or against specific parental genomes. Genome-wide association study (GWAS) analysis for PA, FA, and PUC detected strong associations with two myeloblastosis (MYB) genes similar to MYB113 involved in the anthocyanin biosynthesis pathway, and with a COP1 gene which encodes for a photoregulatory protein and may be responsible for the PUC trait. Evidence was found of a duplication of an ancestral MYB113 gene with a translocation from chromosome 10 to chromosome 1 compared with the tomato genome. Parental genotypes for the three genes were in agreement with the identification of the candidate genes performed in the S3MEGGIC population. Our new eggplant MAGIC population is the largest recombinant population in eggplant and is a powerful tool for eggplant genetics and breeding studies.

INTRODUCTION
Multi-parent experimental populations are of great interest for the genetic dissection of quantitative traits and for the development of new recombinant materials for plant breeding . Despite their complex management and resources requirement, multi-parent advanced generation intercross (MAGIC) populations represent powerful next-generation mapping tools by combining a high genetic diversity and recombination with a low population structure (Arrones et al., 2020;Scott et al., 2020). MAGIC populations are already available in model species, such as Arabidopsis thaliana and in several crops, such as cereals, pulses, and vegetables (Kover et al., 2009;Bandillo et al., 2013;Pascual et al., 2015;Huynh et al., 2018), and have demonstrated their power to dissect the structure of complex traits (Dell'Acqua et al., 2015;Stadlmeier et al., 2018).
Although the available MAGIC populations have become a useful resource for genetic studies and breeding, most of them have only exploited intraspecific variation. The incorporation of crop wild relatives (CWRs) as founders could be a way of including multiple wild genomic fragments or introgressions into cultivated background genomes (Arrones et al., 2020). Apart from being of great interest for genetic analysis, interspecific MAGIC populations can be useful for broadening the genetic base of crops and provide new variation for breeding multiple traits, including those related to adaption to climate change (Gramazio et al., 2020a). However, so far, the potential of interspecific MAGIC populations for plant breeding has largely remained unexploited (Arrones et al., 2020).
Eggplant (Solanum melongena L.) is a major vegetable crop of increasing importance, ranking fifth in global production among vegetables (FAOSTAT, 2019). Despite its economic importance, eggplant has lagged behind other major crops and little effort has been made to develop immortal experimental populations and genetic and genomic tools (Gramazio et al., , 2019. So far, only one population of recombinant inbred lines (RIL) and one set of introgression lines (ILs) are publicly available (Lebeau et al., 2013;Gramazio et al., 2017a), while no multiparent population has been developed, so far. Conversely, in other related Solanaceae crops, such as tomato, several experimental populations have been developed, including MAGIC populations, which have allowed great advances in the genetic dissection of traits of interest (Pascual et al., 2015;Campanelli et al., 2019). For this reason, the development of this type of population would represent a landmark in eggplant breeding.
Anthocyanins are responsible for a set of specific and relevant traits in eggplant which can be used as a model plant for other crops (Moglia et al., 2020). Anthocyanins play a key role in plant defence mechanisms, and their synthesis and accumulation may vary in response to specific biotic and abiotic stresses (D'Amelia et al., 2018;Lv et al., 2019;Zhou et al., 2019). In addition, anthocyanins may prevent and alleviate human chronic diseases and provide health benefits (Toppino et al., 2020). In eggplant, anthocyanins are responsible for the purple colour of the peel, one of the traits of greatest interest for eggplant breeding (Daunay and Hazra, 2012). Purple coloured eggplant fruits are the most demanded in various markets , and developing dark purple-coloured eggplants, which results from the combination of anthocyanins with chlorophylls, is a major objective in eggplant breeding programmes. Eggplants are variable for the presence of anthocyanins not only in fruits, but also in other plant organs and tissues such as hypocotyl, stem, leaves, leaf veins, prickles, flower calyx, or corolla (Toppino et al., 2020). Anthocyanin biosynthesis has been widely studied in Solanaceae species (Van Eck et al., 1994;Borovsky et al., 2004;Gonzali et al., 2009), but its genetic control in eggplant has not been fully clarified. The anthocyanin biosynthetic pathway is a very conserved network in many plant species, where many enzymes and regulatory transcription factors (TFs) are involved (Albert et al., 2014).
Here, we report on the first eggplant MAGIC population derived from an interspecific cross of seven accessions of S. melongena and its wild relative S. incanum (Gramazio et al., 2019). It represents the largest experimental population described, so far, in eggplant, with a similar population size to MAGIC populations in other solanaceous crops. The population has been genotyped by applying the Single Primer Enrichment Technology (SPET) to explore its genetic architecture and the contribution of founders to the final population, and phenotyped for the presence of anthocyanins in the fruit epidermis and other plant organs and for the PUC trait. These traits were chosen due to their physiological, agronomic, and morphological relevance, high stability, and heritability. An association analysis has been performed to locate the genomic regions and to identify the candidate genes involved in the traits under study.

Multi-Parent Advanced Generation Inter-Cross Population Construction
The eggplant MAGIC population has been developed by intermating seven cultivated eggplants, i.e., MM1597 (A), DH ECAVI (B), AN-S-26 (D), H15 (E), A0416 (F), IVIA-371 (G) and ASI-S-1 (H), and the S. incanum accession MM577 (C) (Figure 1A). The wild relative founder was chosen for its tolerance to some biotic and abiotic stresses, mainly drought (Knapp et al., 2013), and for showing a high phenolic content . The performance of the founders was comprehensively characterised in previous morphoagronomic and genetic diversity studies (Hurtado et al., 2014;Gramazio et al., 2017b;Kaushik et al., 2018). In addition, their genomes had been resequenced (Gramazio et al., 2019). The latter study highlighted that, in the founder parents, the residual heterozygosity was less than 0.06%.
In order to develop the eggplant S3 Magic EGGplant InCanum (S3MEGGIC) population, founder lines had been inter-crossed by following a simple "funnel" approach (Wang et al., 2017;Arrones et al., 2020; Figure 1A). The eight founders (A-H) had been pairwise inter-crossed to produce two-way or simple F1 hybrids (AB, CD, EF, and GH), which were subsequently inter-crossed in pairs (AB × CD and EF × GH) to obtain two four-way or double hybrids (ABCD and EFGH). In order to achieve a complete admixture of all founder genomes and to avoid assortative mating, the double hybrids were intercrossed following a chain pollination scheme, with each individual being used as female and male parents (Díez et al., 2002;Supplementary Figure 1A).
All the obtained eight-way or quadruple hybrids (S0 generation) presented all the eight randomly shuffled genomes and only differed on the cytoplasm inherited from the maternal parent. The S0 progenies obtained using the double hybrid ABCD, as a female parent carried the cytoplasm of the wild S. incanum MM577, while those derived using the double hybrid EFGH as a female parent carried the cytoplasm of the cultivated S. melongena ASI-S-1. Subsequently, the S0 progenies were selfed for three generations by a single seed descent (SSD) to obtain the S3 segregating individuals that were phenotyped and genotyped in this study. To ensure the continuity of the S0 progenies and to accelerate the self-fertilisation process, four plants of each S0 progeny were germinated, and only the first two that set a viable seed were selected for the next generation (S1; Supplementary Figure 1B). From each of the two S0 selected plants, two S1 plants were germinated, and only the first setting fruit was selected for the S2 generation. The same was done for the S3 generation so that for each progeny, two plants were germinated and phenotyped. Despite this, only one was used for originating the next generation. On the other hand, in the S3 progenies, if two individuals displayed some phenotypic differences, both of them were included in the S3MEGGIC population.

Cultivation Conditions
Seeds were germinated in Petri dishes, following the protocol developed by Ranil et al. (2015) and, subsequently, transferred to seedling trays in a climatic chamber under a photoperiod and temperature regime of 16 h light (25 • C) and 8 h dark (18 • C). After acclimatisation, plantlets were transplanted to 15 L pots and grown in a pollinator-free benched glasshouse of the Universitat Politècnica de València (UPV), Valencia, Spain (GPS coordinates: latitude, 39 • 28 55 N; longitude, 0 • 20 11 W; 7 m above sea level). Plants were spaced 1.2 m between rows and 1 m within the row, and fertirrigated using a drip irrigation system and trained with vertical strings. Pruning was done manually to regulate the vegetative growth and flowering. Phytosanitary treatments were performed when necessary. In order to shorten generation time of subsequent generations (S0-S3), plantlets were transplanted to individual thermoformed pots (1.3 L capacity) in a pollinator-free glasshouse, and selfings were stimulated by a mechanical vibration.

High-Throughput Genotyping
Young leaf tissue was sampled from 420 S3 individuals, the eight founders, and the four two-way hybrids. Genomic DNA was extracted using the silica matrix extraction (SILEX) extraction method  and checked for quality and integrity by agarose electrophoresis and NanoDrop ratios (260/280 and 260/230), while its concentration was estimated with Qubit 2.0 Fluorometer (Thermo Fisher Scientific, Waltham, MA, United States). After dilution, the samples were sent to Identity Governance and Administration (IGA) Technology Services (IGATech, Udine, Italy) for library preparation and sequencing with NextSeq500 sequencer (150 paired-end) for a high-throughput genotyping using the SPET technology using the 5k probes eggplant SPET platform (Barchi et al., 2019a). The latter comprises 5,093 probes, and was developed by filtering out the most informative and reliable polymorphisms (3,372 of them in coding sequences or CDS and 1,721 in introns and untranslated regions or UTR regions) from the set of over 12 million single-nucleotide polymorphisms (SNPs) identified among the MAGIC founders (Gramazio et al., 2019).
Raw reads were demultiplexed and the adapters were removed using a standard Illumina pipeline and Cutadapt (Martin, 2011), while trimming was performed by ERNE (Del Fabbro et al., 2013). Clean reads were mapped onto the eggplant reference genome "67/3" (Barchi et al., 2019b) using BWA-MEM (Li, 2013) with default parameters, and only the uniquely aligned reads were selected for the variant calling performed with GATK 4.0 (DePristo et al., 2011), following the best practice recommended by the Broad Institute. 1 The SNPs identified by SPET were filtered using the Trait Analysis by Association, Evolution, and Linkage (TASSEL) software (ver. 5.0, Bradbury et al., 2007) in order to retain the most reliable ones (minor allele frequency > 0.01, missing data < 10% and maximum marker heterozygosity < 70%). In addition, a linkage disequilibrium (LD) k-nearest neighbour genotype imputation method (LD KNNi) was performed to fill the missing calls or genotyping gaps.

Population Structure, Heterozygosity, and Haplotype Blocks Inferring
A principal component analysis (PCA) was performed to assess the population structure of S3MEGGIC using the R package vcfR (Knaus and Grünwald, 2017) and the function glPCA of the Adegenet package (Jombart, 2008). Finally, the PCA was graphically plotted with ggplot2 (Wickham, 2016). An Analysis of Molecular Variance (AMOVA) was performed to estimate the population differentiation according to the cytoplasm (cultivated vs. wild) of the individuals of the S3MEGGIC population by using the function poppr.amova of the poppr R package (Kamvar et al., 2014). The residual heterozygosity and its distribution were evaluated with TASSEL software (ver. 5.0, Bradbury et al., 2007). Parental contribution to S3MEGGIC individuals and haplotype blocks were estimated by using R-package HaploBlocker (Pook et al., 2019).

Phenotyping and Genome-Wide Association Study
Phenotypic data were collected from the 420 S3 individuals grown during the 2019/2020 season. Three traits were screened using a binary classification (presence/absence), namely, anthocyanins presence in vegetative plant tissues (PA) and fruit epidermis (FA), and anthocyanic PUC (Supplementary Figure 4). The presence of PA was phenotyped when the purple coloration was observed in any vegetative plant parts such as in stem, branches, leaf veins, or prickles. For FA, anthocyanins were considered as present when the purple colouration was observed in the fruit epidermis regardless of their distribution (uniform, listed, etc.) or intensity. The PUC trait could only be phenotyped in anthocyanic fruits by removing the calyx and observing the presence of anthocyanins under it. FA and PUC traits were screened at the stage of commercial maturity (e.g., when the fruit was physiologically immature) which is the best stage for phenotyping these traits. Phenotypic characteristics of the eight founders and two-way hybrids are described in Figure 1A.
Using the phenotypic and genotypic data collected from the S3MEGGIC individuals, a Genome-Wide Association Study (GWAS) was performed for the selected traits using the TASSEL software (ver. 5.0, Bradbury et al., 2007). For the association study, mixed linear model (MLM) analyses were conducted. The MLM analysis uses both fixed and random effects, which incorporates the kinship among the individuals. The multiple testing was corrected with the Bonferroni and the false discovery rate (FDR) methods (Holm, 1979;Benjamini and Hochberg, 1995) to identify candidate-associated regions at the significance level of 0.05 (Thissen et al., 2002). SNPs with limit of detection (LOD) [−log10(p-value)] over these thresholds or cut off values were declared to be significantly associated with the anthocyanin presence. The R qqman package (Turner, 2018) was used to visualise the Manhattan plots and LDBlockShow (Dong et al., 2020) to determine the LD and the plot haplotype block structure. The pattern of pairwise LD between SNPs was measured by LD correlation coefficient (r 2 ) by considering haplotype blocks with default r 2 values greater than 0.5 supported by the solid spine of LD method (Gabriel et al., 2002;Barrett et al., 2005). The LD was used to narrow down the genomic regions with significant associations. The genes underlying the associated regions were retrieved from the "67/3" eggplant reference genome (ver. 3) (Barchi et al., 2019b). Genes were considered as potential candidates in controlling the assessed traits when carrying homozygous allelic variants classified as "high impact" according to SnpEff software v 4.2 prediction (Cingolani et al., 2012) of the eight MAGIC founders (Gramazio et al., 2019). The Integrative Genomics Viewer (IGV) tool was used for visual exploration of founder genome sequences to validate SnpEff results and to confirm the presence of the so-called "high impact" variants (Robinson et al., 2020). In addition, eggplant and tomato syntenies were assessed by a BLASTx search of candidate genes sequences against the tomato genome (version SL4.0) in the Sol Genomics Network database. 2

Multi-Parent Advanced Generation Inter-Cross Population Construction
Seven accessions of eggplant and one of the wild relative S. incanum were selected as founder parents (A-H) for the construction of the eggplant MAGIC population (S3MEGGIC). Following a funnel breeding scheme ( Figure 1A and Supplementary Figure 1), a total of 420 individuals of the MAGIC populations were obtained. First, founders were pairwise inter-crossed to produce four two-way hybrids (AB, CD, EF, and GH), which were subsequently inter-crossed in pairs to obtain four-way hybrids (ABCD and EFGH). One-hundred and forty-nine individuals of each of the two four-way hybrids were inter-crossed using a chain pollination scheme (Supplementary Figure 1). Out of the theoretical maximum of 298 eight-way hybrid progenies (S0), seeds were obtained for 209 of them, of which 116 carried the S. melongena ASI-S-1 cytoplasm and 93 carried the S. incanum MM577 cytoplasm. Two plants per S0 progeny were used to advance the population, reaching 402 S1 progenies. These S1 progenies were advanced through single seed descend (SSD) to obtain the 391 S2 and 305 S3 MAGIC progenies. The final S3MEGGIC population was constituted by 420 S3 individuals, of which 348 individuals carried the cultivated cytoplasm and 72 the wild cytoplasm.

Single Primer Enrichment Technology Genotyping
The genotyping of the 420 S3 MAGIC individuals, the eight founders, and the four two-way hybrids by the eggplant SPET platform yielded 22,146 SNPs. After filtering, 7,724 high-confidence SNPs were retained for the subsequent analysis, and the low percentage of missing calls (0.53%) was imputed. Filtered SNPs were distributed across the entire  1 | Statistics of the genotyping using the eggplant 5k probes Single Primer Enrichment Technology (SPET) platform of the 420 S3MEGGIC population individuals using the "67/3" eggplant reference genome (Barchi et al., 2019b). FIGURE 2 | Heterozygosity and proportion of missing data from the S3MEGGIC population. The top histogram represents the observed heterozygosity which is skewed to the left (mean 6.87%; mode 5.02%). The lower graph reports the residual heterozygosity of S3 individuals as black dots copared against two-way hybrids represented as red dots.

Population Structure
Population stratification, performed by PCA, indicated the absence of subgroups in the S3 individuals since no clear clustering was observed (Figure 3). The first two principal components (PCs) account for 5.15% (PC1) and 3.30% of the total variation, respectively, while the first 10 PCs explain together only 25.14% of the total variation, revealing the absence of genetic structure in the S3MEGGIC population.
No differentiated clusters among individuals carrying the wild (S. incanum MM577) or the cultivated (S. melongena ASI-S-1) cytoplasm were observed. An Analysis of Molecular Variance (AMOVA) was also performed, revealing that only 0.29% of the total sums of squares is accounted for the molecular variation among the S. melongena and S. incanum cytoplasm groups, thereby resulting in a very small phi-value of 0.0019, which indicates a low level of differentiation and supporting that no population structure exists. The genome mosaics reconstruction of the S3 MAGIC individuals, in terms of the eight founder haplotypes, showed different haplotype block proportions depending on the genomic position for all chromosomes (Figure 4). The estimated contribution of some founders to the overall S3MEGGIC population differed from the expected value of 12.5%. Two of the founder genomes (A0416 and IVIA-371) had a high representation in the genome of the S3 individuals (32.6% and 23.6%, respectively), while two others (AN-S-26 and H15) had a small representation (0.3% in both cases). The wild founder, S. incanum, had an average haplotype representation of 5.8%.

Phenotypic Variation Among Multi-Parent Advanced Generation Inter-Cross Individuals and Association Analysis
The screening for absence or presence of PA, FA, and PUC in purple fruits of the 420 S3 MAGIC individuals revealed a considerable variation (Figure 1B). Out of the 420 S3 individuals, 57.6% displayed PA and 37.5% FA. Among the individuals displaying FA, 64.3% had the PUC phenotype.
Given that no population structure was observed for the S3MEGGIC population, the phenotypic data, together with the genotypic information, were used for GWAS analysis (Figure 5). The GWAS was performed taking into account the kinship in an MLM leading to the identification of significant associations for the evaluated traits.

Plant Anthocyanins
The Manhattan plot for PA revealed two major peaks on chromosome 9 and 10 (PA9 and PA10), with seven significant SNPs over the FDR threshold (LOD > 4.09), and three of them over the Bonferroni threshold (LOD > 5.16) (Figure 5A). One SNP (LOD = 4.25) was mapped on PA9 region (between 17.0 and 17.4 Mb; Supplementary Figure 2A) and six SNPs, with an LOD between 4.31 and 6.48, were mapped on a PA10 region (between 91.08 and 94.81 Mb; Figure 5A).

Pigmentation Under the Calyx
Among the three traits evaluated, PUC was the one with the highest number of significant associations. The Manhattan plot for PUC revealed seven main peaks on chromosomes 1 (PUC1), 3 (PUC3.1 and PUC3.2), and 10 (PUC10.1-10.4), with 36 SNPs over the FDR threshold (LOD > 3.7), and 23 of them are above the Bonferroni threshold (LOD > 5.16) (Figure 5F). One SNP with an LOD of 4.21 was located on the PUC1 region (between 5.11 and 5.88 Mb) in position 5,480,282 ( Figure 5G), 133,305 bp away of the significant SNP on the FA1 region ( Figure 5D) Figure 5H). Another twelve SNPs were located on the PUC10.3 region, between 91.08 and 94.81 Mb (LOD between 3.86 and 12.44; Figure 5I), as observed for PA ( Figure 5B) and FA ( Figure 5D). The 16 remaining SNPs (LOD between 3.71 and 7.93) were found in the PUC10.4 region, between 98.25 and 100.55 Mb (Supplementary Figure 2F).

Candidate Genes for Anthocyanin Biosynthesis
Based on the results of the GWAS analysis, putative candidate genes were identified close to or within the LD blocks defined in the genomic regions with significant associations (Supplementary Table 1). On chromosome 10, in the genomic region (91.08-94.81 Mb) associated with all the evaluated traits (PA10, FA10, and PUC10.3), a candidate gene was identified as similar to MYB113 (SMEL_010g351850.1), a well-known regulatory transcription factor controlling anthocyanin synthesis in eggplant Shi et al., 2021). In addition, in the FIGURE 4 | Genome-wide founder haplotype blocks assignment across the entire S3MEGGIC population. In the x-axis, 12 eggplant chromosomes are represented and in the y-axis the average percentage of founders contribution for the 420 S3 individuals. In the legend, the colour code associated with each founder as in Figure 1.
genomic region associated with FA and PUC (FA1 and PUC1) on chromosome 1 (5.11-5.88 Mb), another candidate gene was identified similar to MYB113 (SMEL_001g120500.1). Variants that predicted high impact effects on protein function were annotated by SnpEff for both MYB113 genes in the population founders (A, C, and F) that do not present anthocyanins in plants and fruits, as confirmed by aligning the founder gene sequences (Figures 5J,K). Specifically, for the founders A and C, single frameshift variants were identified in two different positions while a disruptive inframe deletion and a slice region variant were identified for the founder F in a third region of the eggplant gene SMEL_010g351850.1 (Figure 5J). For SMEL_001g120500.1,  founders A and F exhibited the exact same splice donor, and the intron variant predicted as high impact, while in the founder C, a splice acceptor and intron variant were identified ( Figure 5K). Founder reconstruction of the S3 individuals and haplotype blocks were estimated for both candidate gene regions (chromosome 10: 91.08-94.81 Mb; chromosome 1: 5.11-5.88 Mb) (Supplementary Figures 3A,B). Founders that contribute most to anthocyanin-related traits in these regions are IVIA-371 (111 and 97 S3 individuals, respectively) and ASI-S-1 (74 and 75 S3 individuals, respectively). Furthermore, a reciprocal best hit BLAST analysis of the SMEL_001g120500.1 onto the tomato genome indicated that this gene corresponded to the orthologue SlANT1-like with 70.99% identity. The eggplant SMEL_010g351850.1 corresponded to the tomato orthologue SlAN2-like with a 73.73% identity, which has been described as the best candidate gene for the anthocyanin fruit biosynthesis in tomato (Yan et al., 2020). These results suggest that a single duplication event and a translocation of a fragment of at least 357 kb from chromosome 10 to 1 ocurred during eggplant evolution (Figure 6).
A candidate gene that corresponds to a COP1 (SMEL_010g339180.1), a gene that encodes for photo-regulatory proteins (Jiang et al., 2016;Li et al., 2018;He et al., 2019;Naeem et al., 2019), was reported in the genomic regions associated with PUC (PUC10.2) on chromosome 10 (3.94-4.34 Mb; Supplementary Table 1). High-effect variants were detected in COP1 gene in those founders that are able to synthesise anthocyanin under the calyx (B and G) and in the green wild species S. incanum (C) (Figure 5L). The C founder mutation was confirmed by the F1 hybrid phenotype from the C × D inter-cross, PUC (anthocyanic pigmentation under the calyx) × puc (no anthocyanic pigmentation under the calyx), which showed the anthocyanin fruits with pigmentation under the calyx (Figure 1A). The wild founder presented multiple SNPs compared with the rest that were predicted to cause a frameshift and missed variants, while founders B and G exhibited variants that were predicted to produce stop codons ( Figure 5L). Founders with a higher haplotype representation contributing to PUC in this region (chromosome 10: 3.94-4.34 Mb) are IVIA-371 (113 S3 individuals) and DH_ECAVI (71 S3 individuals) (Supplementary Figure 3C). Furthermore, we found two candidate genes annotated as similar to BHLH, basic helix loop helix protein A (SMEL_009g326640.1), and, similar to SPA3, protein SPA1-RELATED 3 (SMEL_010g338090.1), respectively (data not shown), close to LD blocks in the PA9 and FA9 (between 17,862,090 and 17,872,428 Mb), and PUC10.1 (between 3,128,678 and 3,143,068 Mb) regions. Although they were described as genes related to the anthocyanin biosynthetic pathway (Maier et al., 2013;Li et al., 2018;Shi et al., 2021), no high-effects variants were identified in these genes for any of the founders.

DISCUSSION
Multi-parent advanced generation inter-cross (MAGIC) populations are outstanding genetic materials for identifying gene-trait associations with high resolution (Arrones et al., 2020;Scott et al., 2020). The introduction of multiple founders with an increased genetic and phenotypic diversity, together with the multiple rounds of inter-crossing and selfing, increases the number of accumulated recombinant events and, thus, improves the mapping accuracy (Scott et al., 2020). By introducing a wild relative as a founder parent, the genetic variability in the population increases, which is a key point for QTL identification (Gramazio et al., 2020a). Here, we present the first eggplant S3MEGGIC population of which one of the founders was one accession of the close wild relative S. incanum.
Large population sizes are essential to increase the power and mapping resolution in MAGIC populations (Collard et al., 2005;Valdar et al., 2006;Jaganathan et al., 2020). Following a simple "funnel" scheme design, the population was kept as large as possible to gather a large number of recombination events. However, a sharp reduction in the number of progenies was observed at the S0 generation, which might be related to the use of the wild species S. incanum (C) as a female founder and parent to obtain the simple (CD) and the double (ABCD) hybrids. This interspecific crossing dragged the maternal cytoplasmic background of the wild parent, which might have caused a partial sterility and bias in subsequent generations. Some studies confirmed a strong effect of wild Solanum cytoplasms in the reduction in pollen fertility of alloplasmic lines (Khan et al., 2020;Isshiki et al., 2021). However, the PCA highlighted the absence of a population structure, also confirmed by the lack of genetically differentiated cytoplasmic groups.
The genotyping of the S3MEGGIC population was carried out with the 5k probes eggplant SPET platform with a welldistributed marker density along all chromosomes (Barchi et al., 2019a). This genotyping strategy has already been used in the analysis of biparental populations (Herrero et al., 2020).
In this study, we have verified that its use can be extended to multiparent populations. The genotyping revealed a low heterozygosity in the S3MEGGIC individuals, similar to the expected value for an F5-like biparental inter-cross generation (6.25%). The contributions of each of the founder parents to the S3MEGGIC population revealed that some parents had a higher representation than others. Apart from drift effects, several biological reasons could potentially explain cryptic selection processes, causing the unbalanced representation of the genomes (Rockman and Kruglyak, 2008;Thépot et al., 2015), including seed dormancy, delayed germination, precocity, reduced fertility, and parthenocarpy associated to some genomes which have already been reported in eggplant and other crops (Barchi et al., 2010;Khan et al., 2015;Prohens et al., 2017). The rather limited contribution of the wild species S. incanum to the final S3 MAGIC individuals may have been caused by a selection pressure, as progenies bred from crosses involving the two different species tend to suffer from a reduced fertility and show a segregation distortion (Lefebvre et al., 2002;Barchi et al., 2010). In addition, S. incanum has a recalcitrant germination and a very erratic flowering and fruit set, which strongly depends on environmental conditions (Gisbert et al., 2011;Mangino et al., 2020Mangino et al., , 2021. Other reasons for the segregation distortion could be the inability of the current genotyping density to efficiently distinguish between the founders that are genetically closer, like AN-S-26 and H15 genotypes. This phenomenon has already been observed in previous MAGIC populations (Dell'Acqua et al., 2015). A deeper genotyping, resulting in a better haplotype reconstruction, might shed light on the mechanisms that have led to the unbalanced representation of the founder genomes in the S3MEGGIC population. Due to this low contribution, care should be taken in the future when analysing the traits that are present only in these founders.
The anthocyanin biosynthetic pathway is one of the most studied biochemical routes in plants due to its physiological importance. Major structural genes of this pathway are under the control of a regulatory complex, where myeloblastosis (MYB) TFs are recognised as main regulators alone or in complexes with other TFs (Ramsay and Glover, 2005;Kiferle et al., 2015;Liu et al., 2018). Activator or repressor MYB proteins directly and competitively bind the basic-helix-loop-helix (bHLH) via the amino terminus domain and can act as positive or negative transcriptional regulators in a tissue-specific mode to modulate the anthocyanin synthesis (Barchi et al., 2019b;Moglia et al., 2020).
In eggplant, anthocyanin-related MYB protein-encoding genes have been reported to be related to fruit peel colouration Docimo et al., 2016;Xiao et al., 2018;Moglia et al., 2020;Toppino et al., 2020). In tomato, a cluster of four different MYB proteins was reported to be involved in the anthocyanin synthesis located on chromosome 10 and encoded by SlAN2, SlANT1, SlANT1like, and SlAN2-like genes (Solyc10g086250, Solyc10g086260, Solyc10g086270, and Solyc10g086290, respectively). However, genetic associations to only two paralogue MYB113 genes were detected in the S3MEGGIC population on chromosomes 1 and 10 (SMEL_001g120500.1 and SMEL_010g351850.1). The same occurs in potato, in which, only two anthocyanin genes have been identified, i.e., Sotub10g028550 and Sotub10g028540, both located on chromosome 10. Similarly, in pepper, in addition to CA10g11690 and CA10g11650, a third gene (CA10g11710) has been identified as an orthologue to SlAN2-like (Barchi et al., 2019b). In eggplant, these two tomato orthologs were previously described as SmelANT1 and SmelAN2 (Docimo et al., 2016;Barchi et al., 2019b), corresponding to SlANT1 and SlAN2. However, the reciprocal best hit BLAST analysis of the eggplant coding proteins showed a stronger homology, respectively, to tomato SlANT1-like and SlAN2-like. Although previous studies in eggplant indicated that the overexpression of SmelANT1 accounts for constitutive upregulation of most anthocyanin biosynthetic genes Shi et al., 2021), we considered the SmelAN2 as the best candidate gene for different reasons. Previous studies in eggplant have located the major anthocyanin-related QTLs on chromosome 10 ( Barchi et al., 2012;Cericola et al., 2014;Toppino et al., 2016), which are in agreement with the highest association signals for anthocyanin-related traits in our GWAS results. Furthermore, recent studies in tomato suggested that SlAN2-like functions as an activator to regulate biosynthesis genes, including SlANT1-like, and controls the accumulation of anthocyanins (Yan et al., 2020). However, in the S3MEGGIC population, high-impact variants on protein function were found in both MYB113 genes for non-anthocyanin fruits. These results could suggest that a duplication of function occurred during an eggplant evolution, and both genes may be required for anthocyanin synthesis.
Although it has been demonstrated that the activation of the anthocyanin biosynthetic pathway in eggplant is strongly regulated by light Xiao et al., 2018), the PUC mutation confers to some genotypes the ability to synthesise anthocyanins under the calyx regardless of incidence of light (Tigchelaar et al., 1968). In this study, a candidate gene, related with light-dependent anthocyanin biosynthesis in fruits, was detected at the beginning of chromosome 10. The constitutive photomorphogenic1 (COP1) gene has been reported to be a regulatory TF responsible for mediating a light-regulated gene expression and development (Jiang et al., 2016;Li et al., 2018;He et al., 2019;Naeem et al., 2019). The COP1 gene has been demonstrated to act as a light-inactivable repressor interacting with MYB TFs (Jiang et al., 2016), and is considered a "molecular switch" in metabolic processes, which are stimulated by light. In presence of light, COP1 expression is inhibited, and its concentration decreases rapidly, promoting anthocyanins synthesis. Under dark conditions, COP1 expression is induced and promotes the degradation of the photomorphogenesis-promoting TF and MYB inhibition by conforming the COP1/Suppressor of phya-105 (SPA) ubiquitin ligase complex (Maier et al., 2013;Jiang et al., 2016;Li et al., 2018). Although PUC can only be observed in anthocyanic fruits, high-impact variants were also identified in the wild species S. incanum COP1 gene (founder C, green fruit). The PUC allele is more common in eggplants from western countries, where European markets demand for more homogeneously pigmentation eggplants. On the other hand, in Asian eggplants, it is more common to find the puc phenotype, such as in the ASI-S-1 founder.

CONCLUSION
In conclusion, the S3MEGGIC population represents a landmark breeding material and a tool of great value, which allows the study and fine-mapping of complex traits due to (i) the highly phenotypically diverse founders; (ii) the large population size, being the largest developed eggplant experimental population so far; (iii) the high degree of homozygosity of the final individuals, which constitute a population of fixed "immortal" lines nearly homozygous at each locus; and (iv) the tailored genotyping SPET platform used for the genetic analysis of the population, which has been developed from the whole genome sequencing (WGS) of the founders and allows the comparison with the genotyped materials with the same setup (Gramazio et al., 2020b). In addition, the S3MEGGIC population has demonstrated its potential usefulness for association studies, allowing the establishment of marker associations to anthocyanin-related genes and identification of candidate genes for plant and fruit anthocyanins, including the first identification of a candidate gene for an economically relevant breeding trait in eggplant such as PUC.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: https://www.ncbi.nlm. nih.gov/, PRJNA392603.

AUTHOR CONTRIBUTIONS
SV, PG, and JP conceived the idea and supervised the manuscript. GM, AA, MP, and PG performed the field trials. GM and AA prepared a first draft of the manuscript. All other authors reviewed and edited the manuscript.

FUNDING
This work was supported by the Ministerio de Ciencia, Innovación y Universidades, Agencia Estatal de Investigación and Fondo Europeo de Desarrollo Regional (grant RTI2018-094592-B-I00 from MCIU/AEI/FEDER, UE) and European Union's Horizon, 2020 Research and Innovation Programme under grant agreement no. 677379 (G2P-SOL project: Linking genetic resources, genomes and phenotypes of Solanaceous crops). AA is grateful to Spanish Ministerio de Ciencia, Innovación y Universidades for a pre-doctoral (FPU18/01742) contract. MP is grateful to Spanish Ministerio de Ciencia e Innovación for a post-doctoral grant (IJC2019-039091-I/AEI/10.13039/501100011033). PG is grateful to Spanish

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2022. 847789/full#supplementary-material Supplementary Figure 1 | (A) Chain pollination scheme of the four-way hybrids followed to obtain the eight-way hybrids. (B) For each S0 progeny, four plants were germinated, selecting for the next generation (S1) only the first two that set fruits with viable seed. For subsequent generations, two plants were germinated and only the first one that set fruit was selected for the next generation.