Cultivated Olive Diversification at Local and Regional Scales: Evidence From the Genetic Characterization of French Genetic Resources

Molecular characterization of crop genetic resources is a powerful approach to elucidate the origin of varieties and facilitate local cultivar management. Here we aimed to decipher the origin and diversification of French local olive germplasm. The 113 olive accessions of the ex situ collection of Porquerolles were characterized with 20 nuclear microsatellites plus their plastid haplotype. We then compared this collection to Mediterranean olive varieties from the Worldwide Olive Germplasm Bank of Marrakech, Morocco. High genetic diversity was observed within local French varieties, indicating a high admixture level, with an almost equal contribution from the three main Mediterranean gene pools. Nearly identical and closely related genotypes were observed among French and Italian/Spanish varieties. A high number of parent–offspring relationships were also detected among French varieties and between French and two Italian varieties (‘Frantoio’ and ‘Moraiolo’) and the Spanish variety (‘Gordal Sevillana’). Our investigations indicated that French olive germplasm resulted from the diffusion of material from multiple origins followed by diversification based on parentage relationships between varieties. We strongly suggest that farmers have been actively selecting olives based on local French varieties. French olive agroecosystems more affected by unexpected frosts than southernmost regions could also be seen as incubators and as a bridge between Italy and Spain that has enhanced varietal olive diversification.


INTRODUCTION
Olive (Olea europaea L.) is the iconic fruit crop of the Mediterranean Basin. Archaeological, historical, and genetic studies support a primary olive domestication in the Near East, probably starting during the Chalcolithic period (Kaniewski et al., 2012; Zohary et al., 2012; Besnard et al., 2013b). Long-distance translocation of varieties followed by admixture events then led to secondary multi-local diversification in central and western Mediterranean regions (Terral, 1997; Besnard et al., 2001a; Belaj et al., 2002; Owen et al., 2005; Baldoni et al., 2006; Breton et al., 2008; Diez et al., 2015). Locally adapted varieties have thus been carefully selected by farmers in several Mediterranean areas and the domestication process is still ongoing (Besnard et al., 2001a; Khadari et al., 2003; Khadari et al., 2008; El Bakkali et al., 2013a; Besnard et al., 2018). Selected trees are still both clonally and seed propagated in traditional agroecosystems from different parts of the Mediterranean Basin (Aumeeruddy-Thomas et al., 2017; Besnard et al., 2018), implying a continuing role of sexual reproduction in varietal diversification, with potential contributions from local domesticated, feral, and wild olives. Due to this diversification process, a high frequency of parentage relationships could be expected among varieties, as previously observed within the Spanish olive germplasm and in grapevine (Bowers et al., 1999; Lacombe et al., 2013). Furthermore, farmer selection of newly adapted olive trees could be viewed as a key process in agroecosystems under changing climatic and ecological conditions, such as those on the fringe of olive growing areas. However, it is still unknown how farmer selection and varietal diversity relate to these changing environmental conditions. Today, in a context of global changes associated with the emergence of pests that threaten olive cropping, especially in southern Europe (e.g. Xylella fastidiosa), there is a call for the selection of varieties adapted to new environmental conditions (De Ollas et al., 2019). The characterization of olive varieties in any germplasm bank and the elucidation of their origins are thus a high priority to ensure efficient use of genetic resources in the future.
In France, olive is traditionally cultivated in southern regions on the rim of the Mediterranean Sea. Major development of olive cultivation was initiated by Phoenicians in Massalia, i.e. present day Marseille, around 2600 BP, while wild olives were already present and some local varieties were also likely cultivated (Terral et al., 2004). More than 100 French olive varieties are currently described based on morphological descriptors (Moutier et al., 2011) and molecular markers (Khadari et al., 2003; Khadari et al., 2004). A few of them are considered as main varieties since they are cultivated over relatively large geographical areas, while more than 80 have a restricted distribution range, generally spanning a few townships. This particularly high diversity at the northern limit of the cultivated olive range may partly be the result of recurrent farmer selection of adapted varieties due to relatively frequent frosts that affect local olive germplasm. A significant portion of present varieties (14%) show a maternal origin from the western Mediterranean region, suggesting a local origin (Besnard et al., 2001a; Khadari et al., 2003). Previous studies based on nuclear genetic markers further supported an admixed origin for most French varieties with a prevalent genetic contribution from the eastern Mediterranean (Besnard et al., 2001a; Haouane et al., 2011), as similarly shown in Italian and Tunisian germplasm, respectively on the northern and southern shores of the Mediterranean Sea (Haouane et al., 2011; Belaj et al., 2012; Khadari and El Bakkali, 2018). Such an admixed origin could be seen as a genetic signature of local olive diversification in the central Mediterranean area. However, the local crop diversification process remains unclear, and may involve major progenitors, as shown, for instance, in Andalusian olives. In addition, it was also shown that cultivars growing on the eastern and western sides of the Rhone valley were differentiated (Khadari et al., 2003), possibly reflecting two pathways of olive cultivar introduction from the Italian and Iberian Peninsulas, respectively.
In the present study, we investigated the cultivated olive diversification process in southern France, with the aim of determining ways to efficiently manage local olive genetic resources. Both nuclear and chloroplast loci were used to characterize the genetic diversity of a set of varieties from the French Olive Germplasm Bank (FOGB) in comparison to the Worldwide Olive Germplasm Bank (WOGB) of Marrakech, Morocco (Haouane et al., 2011;El Bakkali et al., 2013b). We specifically aimed to: (1) assess genetic diversity within the FOGB collection and propose a nested set of French reference varieties representative of total genetic diversity; (2) compare the genetic diversity at two different geographical scales, i.e. local (France) and regional (Mediterranean area); and (3) clarify the origin of French olive germplasm by parentage analyses within and among French and Mediterranean varieties. Our results were examined in light of the diversification process founded on farmer selection within traditional agroecosystems probably hampered by frequent climatic accidents such as frost.

Plant Material
The FOGB includes a total of 113 olive accessions, and is maintained on the island of Porquerolles, near Toulon in southern France (Table 1). These accessions are identified with a variety name and/or with tree coordinates in the collection (Table 1). Among the 63 accessions identified with a variety name, 14 are considered as being the main French varieties since they are cropped over broad areas compared to minor varieties (22), which have a limited distribution range, generally over a few townships, and to local varieties (27), which are only present in one or two orchards (Table 1; Moutier et al., 2004; Moutier et al., 2011).
Genotypes of French accessions were compared to those of other varieties collected throughout the Mediterranean Basin. Four hundred and sixteen accessions from 13 Mediterranean countries that are maintained in the Worldwide Olive Germplasm Bank of Marrakech (WOGB; Supplementary Table S1) were analyzed. Mediterranean varieties conserved in the WOGB collection are classified in three gene pools based on both the country of origin and genetic structure, i.e. East (mostly from Cyprus, Egypt, Lebanon, and Syria), West (mostly from Morocco, Spain, and Portugal), and Central (mostly from Algeria, Italy, Slovenia, Croatia, Tunisia, and Greece; Haouane et al., 2011; El Bakkali et al., 2013b).

Datasets
Twenty microsatellite nuclear loci (SSR) were used for genotyping accessions of both FOGB and WOGB (Table 2), as described by El . These markers were selected based on their clear amplification, high polymorphism, and reproducibility, as reported by Trujillo et al. (2014). Alleles were carefully scored twice independently by two researchers. Genotyping of accessions with a specific allele (i.e. observed only once) was systematically repeated to ensure its occurrence. Plastid DNA (cpDNA) variations were also characterized using 39 markers, including 32 cpSSR loci, five indels (insertions/deletions), and two single nucleotide polymorphisms (SNPs), as described by Besnard et al. (2011).

DATA ANALYSIS
Genetic Diversity and Structure
The number of alleles per locus (Na), expected (He; Nei, 1987) and observed heterozygosity (Ho), and polymorphism information content (PIC) were estimated using the Excel Microsatellite Toolkit v.3.1 (Park, 2001). A binary matrix containing only distinct French genotypes was built, using alleles scored as present (1) or absent (0) to assess genetic relationships within the FOGB collection. This matrix was used to construct a dendrogram based on Dice's similarity index (Dice, 1945) and the UPGMA algorithm with the NTSYS v2.02 software package (Rohlf, 1998).
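As an illustration of the per-locus statistics named above, the following minimal Python sketch computes Na, Ho, He, and PIC for one diploid SSR locus. It is not the Excel Microsatellite Toolkit implementation: the function name `locus_stats` and the toy genotypes are hypothetical, He is computed here in its sample-size-corrected form, and PIC uses its standard definition; whether the toolkit applies exactly these forms is an assumption.

```python
from collections import Counter
from itertools import combinations


def locus_stats(genotypes):
    """Summary statistics for one diploid locus.

    genotypes: list of (allele_a, allele_b) tuples, one per individual.
    Hypothetical helper for illustration; not the toolkit's own code.
    """
    n = len(genotypes)
    total = 2 * n  # number of gene copies sampled
    counts = Counter(allele for pair in genotypes for allele in pair)
    freqs = [c / total for c in counts.values()]
    sum_sq = sum(p * p for p in freqs)

    # Observed heterozygosity: share of individuals with two different alleles.
    ho = sum(1 for a, b in genotypes if a != b) / n
    # Expected heterozygosity with the sample-size correction 2n / (2n - 1)
    # (assumed form).
    he = (total / (total - 1)) * (1 - sum_sq)
    # PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2
    pic = 1 - sum_sq - sum(2 * p * p * q * q for p, q in combinations(freqs, 2))
    return {"Na": len(counts), "Ho": ho, "He": he, "PIC": pic}


# Toy data (invented allele sizes, not from the study).
toy_locus = [(1, 2), (1, 2), (1, 1), (2, 2)]
stats = locus_stats(toy_locus)
```

Averaging these values over the 20 loci would give collection-level summaries comparable in kind, though not necessarily identical in value, to those reported in Table 2.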
The French (FOGB) and Mediterranean (WOGB) collections were compared based on different criteria: (1) genetic parameters such as the allele number (Na), expected and observed heterozygosity (He and Ho); (2) the distribution of pairwise genetic distances between cultivars using the index of Smouse and Peakall (1999) in GENALEX 6 program (Peakall and Smouse, 2006); (3) the allelic richness (Ar) using the ADZE program (Szpiech et al., 2008); (4) a principal coordinate analysis (PCoA) implemented in DARWIN 5.0.137 (Perrier et al., 2003) using the simple matching coefficient to describe relationships between genotypes based on the spatial distribution of the two first coordinate axes; and (5) the genetic structure within both collections using the model-based Bayesian clustering approach implemented in STRUCTURE v.2.2 (Pritchard et al., 2000) according to the parameters described in Haouane et al. (2011). Regarding the genetic structure, the reliability of the number of clusters (K) was checked using the ad hoc ΔK measure (Evanno et al., 2005) with the R program, whereas the similarity index between different replicates for the same K clusters (H′) was calculated using the CLUMPP v1.12 program (Jakobsson and Rosenberg, 2007).
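The ad hoc ΔK criterion mentioned above can be sketched as follows. This is a minimal Python illustration rather than the R code used in the study: it assumes that replicate log-likelihoods L(K) are available for consecutive values of K, uses the common simplification of taking the second difference of the replicate means divided by the standard deviation of L(K), and the function name `delta_k` and all numbers are invented for the example.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Return {K: delta K} for every K that has both neighbours K-1 and K+1.

    lnp_by_k: dict mapping K to a list of replicate log-likelihoods
    (at least two replicates per K, with non-zero spread).
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue  # delta K is undefined at the ends of the K range
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_diff / stdev(lnp_by_k[k])
    return result


# Invented replicate log-likelihoods, for illustration only.
toy_lnp = {
    1: [-100, -102],
    2: [-80, -82],
    3: [-50, -52],
    4: [-49, -51],
    5: [-48, -50],
}
dk = delta_k(toy_lnp)
best_k = max(dk, key=dk.get)  # K with the largest delta K
```

In this toy series the likelihood stops improving sharply after K = 3, so ΔK peaks there; the sketch says nothing about the values obtained for the olive datasets.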

Parentage Analysis
Parentage analyses were based on nuclear SSR data and aimed at detecting putative parent-offspring relationships among French varieties, as well as between the latter and varieties from the whole Mediterranean Basin. A putative parent-offspring pair is defined as any pair of individuals that share at least one allele at every locus; the set of putative pairs therefore contains both true and false parent-offspring pairs (Jones et al., 2010). Indeed, the probability of two unrelated genotypes sharing alleles by chance at all loci is not trivial, especially for a large set of pairwise comparisons with a limited number of molecular markers. A key challenge addressed in our analyses was to correctly identify the true parent-offspring pairs within a dataset, while simultaneously excluding pairs that could potentially have shared alleles by chance. Considering the large panel of examined varieties without any available information on parentage relationships, pedigree reconstruction based on parental pair assignment may not be as robust as in cases when one parent is already known (Jones et al., 2010), and the probability of detecting false parent-offspring pairs would thus need to be assessed. Here, in a first step, we conducted parentage analyses through a "single-parent search" (Jones et al., 2010) in order to identify putative parent-offspring pairs. Second, based on these results, we used parental pair assignment to construct pedigrees among varieties.
For single-parent searches, we used a complete exclusion approach and two parentage assignment approaches whereby the single most likely parent was chosen from a group of non-excluded candidate parents based on a likelihood method or on a Bayesian posterior probability. (i) First, we used the exclusion-based method with the PARFEX v.1.0 macro (Sekino and Kakehi, 2012). This simple method examines genotype incompatibilities between offspring and parents based on Mendelian inheritance rules. A parentage relationship is established if a single parent of an offspring remains non-excluded from a parental pool considering 0 or 1 mismatching allele at a single locus. (ii) We then used the likelihood-based method (Gerber et al., 2000) available in the PARFEX v.1.0 macro. This parentage inference relies on the difference in the log-likelihood ratio (LOD) between related and unrelated relationships. To define a threshold (LODc) to accept/reject possible parentage relationships (single parent), offspring were simulated using the allelic frequencies (Lobs) observed in our datasets and a random sampling of alleles (Lrand), while taking into account the genotypic error rate for random replacement of simulated genotypes at each marker (esim) and for LOD calculations (ecalc). Simulations were conducted using 1% error rates for ecalc and esim, 200 parents, and 10,000 offspring. The LODc was defined by the intersection of the distributions of Lobs and Lrand. (iii) Lastly, based on the exclusion-Bayes' theorem method (Christie, 2010) using the SOLOMON package in the R program (Christie et al., 2013), the posterior probability of false parent-offspring pairs (among all pairs that share at least one allele across all loci) was assessed in a dataset to determine whether all putative parent-offspring pairs could be accepted with strict exclusion. The probability of observing shared alleles between unrelated individuals was calculated using 1,000 simulated datasets and 50,000,000 simulated genotypes. Finally, parentage inferences of each French genotype were considered as reliable when validated by the three approaches. Because only single parent-offspring relationships were detected, it could not be determined which member of each putative pair was the parent and which was the offspring. Networks of parent-offspring relationships were plotted with the "igraph" package in the R environment (Csardi and Nepusz, 2006).
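The exclusion criterion used in the single-parent search can be illustrated with a short Python sketch. It is a simplified stand-in for the idea, not the PARFEX macro: a pair is retained as a putative parent-offspring pair when the two diploid genotypes share at least one allele at every locus, tolerating at most one mismatching locus. The function names, the accession labels, and the allele sizes are all hypothetical.

```python
def mismatching_loci(genotype_a, genotype_b):
    """Count loci at which two diploid genotypes share no allele.

    Each genotype is a list of (allele_1, allele_2) tuples in the same
    locus order.
    """
    return sum(
        1
        for locus_a, locus_b in zip(genotype_a, genotype_b)
        if not set(locus_a) & set(locus_b)
    )


def putative_pairs(genotypes, max_mismatch=1):
    """List unordered pairs compatible with a parent-offspring relationship.

    genotypes: dict mapping an accession label to its multilocus genotype.
    max_mismatch: number of mismatching loci tolerated (0 or 1 in the text).
    """
    names = sorted(genotypes)
    return [
        (x, y)
        for i, x in enumerate(names)
        for y in names[i + 1:]
        if mismatching_loci(genotypes[x], genotypes[y]) <= max_mismatch
    ]


# Invented three-locus genotypes for three hypothetical accessions.
toy = {
    "A": [(100, 104), (200, 202), (150, 150)],
    "B": [(100, 108), (202, 206), (150, 154)],
    "C": [(110, 112), (210, 212), (150, 158)],
}
pairs = putative_pairs(toy)
```

Pairs are returned unordered because, as noted in the text, sharing alleles alone does not indicate which genotype is the parent. The likelihood and Bayesian steps are then needed to judge how many of the retained pairs could have arisen by chance.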
For parental pair assignments, putative parent-offspring relationships detected with the three previous approaches were re-used. We used the likelihood-based method (Gerber et al., 2000) to assess this panel of relationships because it appears to be the most conservative approach compared to the exclusion-Bayes' theorem method (see Results).

Core Collection Sampling
For agronomic experiments and breeding programs, it may be necessary to define sets of cultivars representative of French cultivated olive germplasm. French core collections were thus constructed from the FOGB collection according to the two-step method described by El . Nested core collections were constructed by combining two approaches implemented in the CoreHunter (Thachuk et al., 2009) and Mstrat (Gouesnard et al., 2001) programs. First, an initial core collection capturing total allelic diversity was constructed with Mstrat to estimate the sample size necessary to capture all observed alleles. Then CoreHunter with the "Sh strategy" was run with half the size of the initially constructed core collection in order to select a primary local core collection with the lowest number of accessions. This primary core collection was used as a kernel in Mstrat to capture the remaining alleles, and 50 independent core collections were proposed.
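The idea of extending a kernel until all alleles are captured can be sketched with a simple greedy heuristic in Python. This is explicitly not the algorithm implemented in Mstrat or CoreHunter; it is a swapped-in, simplified procedure that repeatedly adds the accession contributing the most alleles not yet captured. The function name `greedy_core`, the accession labels, and the allele labels are hypothetical.

```python
def greedy_core(alleles_by_accession, kernel=()):
    """Greedily select accessions until every observed allele is captured.

    alleles_by_accession: dict mapping an accession label to the set of
    alleles it carries (e.g. "locus:size" strings).
    kernel: accessions forced into the core before the greedy search starts.
    Ties are broken alphabetically so that the result is reproducible.
    """
    selected = list(kernel)
    captured = set()
    for name in selected:
        captured |= alleles_by_accession[name]
    all_alleles = set().union(*alleles_by_accession.values())

    while captured != all_alleles:
        candidates = [n for n in sorted(alleles_by_accession) if n not in selected]
        # Pick the candidate that adds the largest number of new alleles.
        best = max(candidates, key=lambda n: len(alleles_by_accession[n] - captured))
        selected.append(best)
        captured |= alleles_by_accession[best]
    return selected


# Invented example with four accessions and five alleles.
toy = {
    "acc1": {"a", "b", "c"},
    "acc2": {"c", "d"},
    "acc3": {"d", "e"},
    "acc4": {"a"},
}
core = greedy_core(toy)
core_from_kernel = greedy_core(toy, kernel=["acc4"])
```

A greedy search gives one solution only, whereas the procedure described above proposes 50 independent core collections; the sketch is meant to show the allele-capture objective, not to reproduce those sets.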

Characterization of French Olive Germplasm and Definition of Reference Genotypes per Variety
One hundred and four distinct genetic profiles were obtained among the 113 accessions of the FOGB based on 20 SSR nuclear loci (Table 1). Among the 6328 pairwise comparisons, 10 were identical (0.16%), 13 (0.19%) were closely related, differing by one or two dissimilar alleles, whereas the remaining pairs were distinguished by three to 37 dissimilar alleles (Figure 1A). Closely related SSR profiles with one or two dissimilar alleles were considered as putative molecular variants resulting from somatic mutations and were thus classified as a single genotype. This was the case for ancient varieties such as 'Boube' or 'Négrette' and also for major varieties, such as 'Aglandau' or 'Cailletier', which are cultivated over broad geographic areas (Supplementary Table S2). The SSR profile considered as the reference genotype of the variety was chosen based on the high frequency of trees under the same molecular profile (Table 1 and Supplementary Table S2). Hence, a total of 92 genotypes was defined among the 113 accessions analyzed and the most closely related pairs were ultimately distinguished by five dissimilar alleles; e.g.   (Figures 1B and 2; Table 1).
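The pairwise comparison underlying these counts can be sketched in Python as follows. The sketch assumes one plausible way of counting "dissimilar alleles" (alleles of one profile left unmatched in the other, summed over loci), which gives a maximum of 40 for 20 diploid loci; the exact counting rule used for the study is an assumption, and the function names and toy profiles are hypothetical. The number of comparisons among n accessions is n(n - 1)/2.

```python
from collections import Counter
from math import comb


def dissimilar_alleles(profile_a, profile_b):
    """Count alleles of profile_a that have no match in profile_b.

    Profiles are lists of (allele_1, allele_2) tuples in the same locus
    order; matching is done per locus and respects allele copy number.
    """
    total = 0
    for locus_a, locus_b in zip(profile_a, profile_b):
        unmatched = Counter(locus_a) - Counter(locus_b)  # multiset difference
        total += sum(unmatched.values())
    return total


def classify_pair(profile_a, profile_b, max_variant=2):
    """Label a pair as identical, closely related (putative variant), or distinct."""
    d = dissimilar_alleles(profile_a, profile_b)
    if d == 0:
        return "identical"
    if d <= max_variant:
        return "closely related"
    return "distinct"


# Number of pairwise comparisons among the accessions of each collection.
n_pairs_fogb = comb(113, 2)  # 6328
n_pairs_wogb = comb(416, 2)  # 86320

# Invented two-locus profiles differing by a single allele.
label = classify_pair([(100, 104), (200, 200)], [(100, 104), (200, 202)])
```

Grouping accessions into genotypes then amounts to merging every pair labeled "identical" or "closely related", which is how 113 accessions reduce to fewer distinct genotypes.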
According to the methodology proposed by Khadari et al. (2003), a total of 63 varieties were validated as reference varieties by checking the morphological traits of olive stones and SSR profiles of several trees originating from different nurseries and orchards (Table 1). For instance, six trees of the 'Cailletier' variety from distinct origins were analyzed to define the reference genotype. Similarly, a total of 15 and 18 trees from different nurseries and orchards were analyzed to validate the reference genotypes of the 'Petit Ribier' and 'Négrette' varieties, respectively (Moutier et al., 2011). The remaining 30 accessions, classified by tree coordinates in the germplasm collection, are currently being validated to determine the reference genotype of each variety according to the methodology described here (Table 1).

Nuclear and Plastid DNA Polymorphism
Considering the 92 genotypes of the FOGB, a total of 191 alleles were revealed with an average of 9.55 alleles/locus (Table 2). Among the 191 alleles detected, 42 (22%) were observed once. For each SSR locus, PIC values ranged from 0.296 at the GAPU71A locus to 0.856 at the DCA04 locus (mean 0.688). Only three out of the 20 loci used were able to discriminate between the 92 genotypes revealed among the 113 accessions analyzed, i.e. DCA04, DCA09, and GAPU101 (Supplementary Table S3).
The use of 39 chloroplast loci revealed the presence of six chlorotypes in the French olive germplasm. As expected (see Besnard et al., 2013b), the most frequent chlorotype was E1-1 (79.4%). One of the five other haplotypes was detected only once, i.e. E3-3 in the accession referred to as '36-28' (Table 1; Figure 2).

Characterization and Pairwise Comparison Between the Two Germplasm Collections
Based on pairwise analysis of the WOGB with 20 nuclear loci, 404 single SSR profiles (min. 1 dissimilar allele) were identified among the 416 Mediterranean olive accessions. Among the 86,320 pairwise comparisons, 36 were identical (0.04%), 166 (0.19%) were closely related (differing by one or two dissimilar alleles), whereas the remaining pairs were distinguished by 3 to 40 dissimilar alleles (Figure 1C). Similar to the FOGB collection (see above), accessions showing identical profiles and those with one or two dissimilar alleles (molecular variants) were considered as belonging to the same genotype, leading to a total of 311 distinct genotypes among the 416 accessions analyzed (Supplementary Table S1).
Pairwise comparisons between the two collections revealed that eight French accessions were identical or closely related to 28 Mediterranean varieties (Table 3). Eighteen out of the 28 varieties originated from Italy, four from Lebanon, whereas the six remaining varieties were from Algeria (2), Spain (1), Cyprus (1), Greece (1), and Morocco (1).

SSR Polymorphism and Genetic Diversity
The 92 genotypes identified in the FOGB collection were used for comparison with the distinct WOGB genotypes. Among the 191 alleles revealed in the FOGB collection, 187 were present in the WOGB genotypes (339 alleles; Table 2). Only four alleles were detected exclusively in the French germplasm (  Smouse and Peakall (1999)] was observed in both FOGB and WOGB (Figure 3): ranging from 3 to 55 (with a mean of 29.01) in WOGB, and from 3 to 49 (mean of 27.36) in FOGB.

Genetic Structure
Admixture model-based Bayesian clustering was performed on both datasets, with a total of 395 distinct genotypes from both collections. According to ΔK and H′, K = 3 was the most probable genetic structure model (ΔK = 554.11 and H′ = 0.998; Figure 4 and Supplementary Figure S1). Among the 92 French genotypes, 15, 8, and 1 were assigned, with a membership probability of Q ≥ 0.80, to East, Central, and West gene pools, respectively; whereas, 68 (73.9%) genotypes were assigned to more than one group, with Q < 0.80 (Table 1 and Supplementary Table S6).
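The assignment rule applied to the membership coefficients can be written as a short Python sketch. It assumes that each genotype has one Q value per cluster at K = 3 and that a genotype is assigned to a gene pool only when its largest Q reaches 0.80; the function name `assign_pool`, the order of the pool labels, and the example Q vectors are hypothetical.

```python
def assign_pool(q, pools=("West", "Central", "East"), threshold=0.80):
    """Assign a genotype to a gene pool, or label it as admixed ("Mosaic").

    q: sequence of membership coefficients, one per cluster, summing to ~1.
    pools: cluster labels in the same order as q (assumed order).
    """
    best = max(range(len(q)), key=lambda i: q[i])
    return pools[best] if q[best] >= threshold else "Mosaic"


# Invented membership vectors for two hypothetical genotypes.
label_assigned = assign_pool((0.85, 0.10, 0.05))
label_admixed = assign_pool((0.50, 0.30, 0.20))

# Arithmetic check of the counts reported in the text.
n_assigned = 15 + 8 + 1            # East, Central, West
n_admixed = 92 - n_assigned        # 68 genotypes with Q < 0.80
share_admixed = round(100 * n_admixed / 92, 1)  # 73.9 (%)
```

The 0.80 cutoff is the one stated in the text; a different threshold would shift genotypes between the assigned and mosaic categories without changing the underlying Q values.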
A principal coordinate analysis (PCoA) was conducted and the findings were plotted according to the genetic groups identified by the STRUCTURE program. The first two principal axes explained 10.46% of the total genetic variance (Figure 5). French cultivars fell within the main range of total diversity observed in WOGB. The majority of French genotypes were classified in the Mosaic Mediterranean group (Q < 0.80; Supplementary Table S6).

Parentage Relationships Between French and Mediterranean Olive Cultivars
Relationship analyses were conducted using genotypes from the FOGB and WOGB collections with more than two dissimilar alleles. The eight genotypes of the WOGB detected to be identical or genetically close to those of FOGB were also excluded (Table 3). Finally, 92 and 303 genotypes from FOGB and WOGB, respectively, were included in the analyses (Table 1 and Supplementary Table S1).
Using the log-likelihood ratio (LOD) method, the LODc was estimated as the intersection between Lrand and Lobs. A threshold LOD of 4.22 gave a success rate of 97.7% in detecting true parent-offspring relationships (Figure 6A). We thus applied this value in parentage testing for the observed data. In turn, the exclusion-Bayes' theorem method indicated that the posterior probability of any pair of genotypes sharing at least one allele across all loci (no mismatch at any locus) by chance was Pr(Phi) = 0.00319, while it was 0.03547 for a false parent-offspring pair with a mismatch at one locus (Figure 7). Since we could not exclude the possibility that there might have been a few errors in our dataset (including somatic mutations and null alleles), we used the threshold <0.03547 as a cutoff for identifying putative parents.
The highest number of putative parent-offspring relationships was observed with the exclusion method (431 parent-offspring pairs), while the Bayesian and LOD methods yielded lower numbers (368 and 239, respectively; Supplementary Table S5). The number of French genotypes with a putative parent-offspring relationship differed between methods: 81 genotypes for the Bayesian-based method, 75 for the exclusion method, and 68 for the LOD method.
For the French varieties, a total of 193 putative parent-offspring pairs were identified when validated by the three approaches. Among these, 101 were detected within the French germplasm since 51 French genotypes (55.4%) were found to have reliable parentage relationships with French varieties only (Table 1 and Supplementary Table S5). Two French varieties showed a particularly high number of putative parent-offspring relationships, i.e. 'Boube' and 'Cailletier', with 36 and 24, respectively (Figure 8), but most of their parentage relationships were established with non-French varieties (30 and 18 putative parent-offspring pairs for 'Boube' and 'Cailletier', respectively). For other French varieties, the number of putative parent-offspring relationships varied from one to six within the French germplasm, and from one to four between French and other Mediterranean varieties (Table 1 and Supplementary Table S5).
Most parent-offspring relationships identified belonged to the same genetic group (Figure 9 and Supplementary Figure S2; Table 4). Varieties from 11 countries, i.e. all except Cyprus, Egypt, and Lebanon, showed at least one putative parent-offspring relationship with a French variety. When comparing the origins of these cultivars, we found that varieties from France, Italy, and Spain had the highest proportion of parentage relationships (Table 4). Out of 115 Mediterranean cultivars, 51 (44.3%), 28 (24.3%), and 18 (15.6%) varieties, respectively, came from France, Italy, and Spain (Table 4). These results underline the importance of parentage relationships within the French germplasm and between French, Spanish, and Italian varieties. Based on the 193 putative parent-offspring pairs, parentage relationships were examined by searching for parental pairs with the likelihood approach. All varieties having at least one putative parent-offspring relationship, as validated by at least one of the three approaches used for the single-parent search, were analyzed, including 86 French genotypes and 155 other Mediterranean varieties (Supplementary Table S5). A threshold LOD of 13.7 gave a success rate of 99.7% in detecting the most likely parental pair of the offspring based on the highest LOD score (Figure 6B). The French 'Boube' variety and the Spanish 'Lechin de Granada' variety were identified as the most likely parental pair for six Spanish varieties (Table 5; Figure 10): 'Negrillo de Iznalloz' was assigned with the highest LODpp value (23.6) and no allele mismatch, while the remaining most likely offspring were assigned at a LODpp ranging from 13.81 to 15.76, with an allele mismatch at one locus (Table 5).
Moreover, the 'Boube' variety was identified as one of the most likely parents of the French '36_25' genotype, with a LODpp at 15.69 and one mismatch at locus DCA16 (Table 5), while the 'Lechin de Granada' variety was identified as the most likely parent of the 'Sevillano de Jumilla' variety, with no allele mismatch. The 'Boube' variety harbors the E1-2 maternal haplotype, and thus could not be the mother of the seven identified offspring that shared the E1-1 maternal haplotype. Surprisingly, we detected only one pair of parents involving the 'Cailletier' variety (

Sampling Varieties to Represent French Olive Genetic Diversity
A core collection was defined according to the two-step method proposed by El   (Supplementary Figure S3). Forty-three genotypes (46.7%) were necessary to capture the 191 alleles in the FOGB collection. Based on half of the initial sample size of 43, a primary core collection of 22 genotypes (23.9%) was constructed (CC22; Supplementary Table S7). The 22 entries thus allowed the capture of 169 alleles (88.5%), three maternal haplotypes (18 E1-1, 1 E2-1, and 3 E3-1; 50%), and 17 reference varieties (Supplementary Table S8). This primary core collection (CC22) was used as a kernel with Mstrat to capture the remaining alleles. Hence, 43 entries (CC43; 46.7%) were sufficient to capture the total diversity. Fifty sets of 43 French varieties were generated using Mstrat as the CC43 (Supplementary Table S7).
No differences were observed in the expected heterozygosity (He; Nei, 1987) in 50 independent runs. In addition to the 22 varieties used as a kernel, 18 varieties were found to be common in all of the 50 independent runs, while a combination of three    Table S8). Most of the varieties sampled in CC43 showed high admixture since 33 varieties (76.7%) belonged to more than one gene pool, while only six and four genotypes were assigned (with membership probabilities of Q ≥ 0.80) to central and eastern gene pools, respectively (Supplementary Table S8). Genotypes selected for the primary core collection (CC22) and for the core collection capturing all alleles (CC43) had the lowest frequency of parentage relationships (8% and 22%, respectively; Supplementary Table S8). This pattern is in line with the findings obtained with the approach used to construct the core collection favoring genotypes without genetic relatedness.

FIGURE 4 | The most probable genetic structure model using the STRUCTURE program at K = 3 for 395 distinct genotypes from both collections. H′ represents the similarity coefficient between runs for each K, and ΔK represents the ad hoc measure of Evanno et al. (2005).

DISCUSSION
Our study first allowed us to generate a database for efficient identification of French varieties. We took advantage of this genetic characterization to investigate French olive genetic diversity and assess the importance of local genetic resources and their associated agroecosystems in the cultivated olive tree diversification process.

Diversification of French Olive Germplasm by Admixture
Primary selection and secondary diversification are two key processes in the history of olive domestication (Khadari and El Bakkali, 2018). Diversification can be viewed as a process that is driven mainly by farmer selection of trees harboring interesting traits. As this selection occurs within the agroecosystem, selected trees are most likely derived from crosses between varieties or previously selected clones, sometimes with pollen coming from feral or wild olive trees (for review, see Gaut et al., 2015; Besnard et al., 2018). Sociohistorical and ethnobiological investigations of traditional olive agroecosystems in northern Morocco have highlighted strong links between selected trees from clonally and seed propagated trees, indicating the continuing roles of cultivated, feral, and wild olive trees in the diversification process (Aumeeruddy-Thomas et al., 2017). Here we also showed an admixed origin of French varieties, suggesting a diversification process involving local and introduced genetic resources. Among the 92 French genotypes, 68 (73.9%) were admixed as they were assigned to more than one group with Q < 0.80. Most French genotypes (82.6%) harbored the eastern maternal lineage [i.e. haplotypes E1-1 (73) and E1-2 (3)], which originated from the eastern Mediterranean Basin and was introduced in the westernmost regions via the diffusion of oleiculture (Besnard et al., 2013b).
As previously suggested by several authors (Besnard et al., 2001a; Baldoni et al., 2006; Belaj et al., 2007; Breton et al., 2008; Besnard et al., 2013a; Diez et al., 2015), in the following arguments we assumed that the French olive germplasm was mainly derived from a diversification process involving local genetic resources, in addition to the introduction of cultivated olives belonging mainly to the Q2 genepool (central Mediterranean), as well as the Q1 genepool (western Mediterranean). First, we observed a clear genetic pattern derived from admixed germplasm from the central Mediterranean area, including French local genetic resources, as previously reported by Haouane et al. (2011), Diez et al. (2012), and El Bakkali et al. (2013a). Second, these local genetic resources harbored a maternal lineage from the eastern primary domestication center (Besnard et al., 2013b). Third, despite the reduction in allelic diversity (22.4%) as compared to Mediterranean cultivated olive, the French germplasm showed a similar expected heterozygosity and pairwise genetic distance pattern compared to Mediterranean olive germplasm, indicating that this pattern was likely a consequence of admixture. Fourth, we highlighted that approximately half of the ex situ collection of Porquerolles (46.7%) was necessary to capture all of the French diversity, which was mainly classified in the mosaic Mediterranean group. Finally, we observed substantial parentage relationships (parent-offspring) at a local scale within French varieties and at a regional scale between French, Italian, or Spanish varieties, indicating that selection from crossing between varieties was likely a key varietal diversification process within French agroecosystems and neighboring regions.

French Agroecosystems as a Bridge Between Italy and Spain for Olive Diversification
Identical and nearly identical genotypes were identified among French and other Mediterranean germplasm. This is evidence in favor of the translocation of varieties between distant regions (e.g. Besnard et al., 2001b; Haouane et al., 2011; Trujillo et al., 2014). Interestingly, we report for the first time the high genetic similarity between 'Cailletier', a major French variety, and the Italian 'Frantoio' variety. These two genotypes were here distinguished by only one allele on the reference genotype of each variety and may represent distinct clones propagated from a single genotype (e.g. due to clonal selection; Bellini et al., 2008). A similar pattern was noted for the French 'Petit Ribier' variety and the Italian 'Moraiolo' variety, as previously observed by Pinatel (2015) in a study using morphological descriptors. (Besnard et al., 2001b; Bronzini de Caraffa et al., 2002). Otherwise, the local French 'Boube' variety was found to be genetically similar to the oldest Spanish variety, i.e. 'Gordal Sevillana', and it was the only French variety clearly assigned to the western genepool (Q1). Pinatel (2015) considered 'Boube' as a local variety (only present in three distant orchards in Alpes de Haute Provence, with a single centennial tree per orchard) that was probably cultivated in southern France in ancient times. Here our results suggest that the variety was probably introduced from the Iberian Peninsula due to the large olive fruit size. Its past importance in the western Mediterranean Basin needs to be reviewed, as it was spread over broad areas (at least from Andalusia to southeastern France) and then was involved in varietal diversification, particularly in Spain, but also elsewhere.
Beyond the substantial parentage relationships within the French germplasm (53.89%), we clearly identified relationships (parent-offspring) between French and Italian varieties, as well as French and Spanish varieties, mainly based on 'Cailletier'/'Frantoio' and 'Boube'/'Gordal Sevillana' varieties, since they harbored the highest number of putative parent-offspring pairs (24 and 36, respectively). Interestingly, the 'Cailletier'/'Frantoio' variety assigned to the central cluster had robust relationships (parent-offspring) with varieties from Italy which belonged to the same cluster, as did the 'Petit Ribier'/'Moraiolo' variety, while the 'Boube' variety displayed parentage relationships with Spanish germplasm. We observed that 'Cailletier'/'Frantoio' and 'Petit Ribier'/'Moraiolo' were the main progenitors of Italian/French varieties, while 'Boube'/'Gordal Sevillana' was the main progenitor of Spanish/French varieties. As previously reported by Diez et al. (2015), we confirmed that 'Gordal Sevillana' was one of the main progenitors of Spanish germplasm, but strikingly we found that it was likely the male parent of six French varieties, including 'Clermontaise' and 'Courbeil', which are cultivated in the southwestern area (Hérault and Pyrénées Orientales; Moutier et al., 2004) bordering northeastern Spain (Catalonia). This Spanish variety was considered by Diez et al. (2015) as being one of the main founders of the western genepool (Q1), and based on our results we hypothesize that it was also the founder of part of the French germplasm assigned to the Mosaic genetic group. In addition, we noted for the first time that 'Frantoio' and 'Moraiolo' were putative progenitors of numerous Italian and French varieties. This result also suggests that these two varieties have been major progenitors within the Central Mediterranean group (Q2).
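Putative parent-offspring pairs such as those discussed above are commonly screened with an exclusion test: at every locus, a parent and its offspring must share at least one allele. The sketch below illustrates that principle only. The function names, the tolerated-mismatch parameter, the locus labels, and the toy genotypes (allele sizes in base pairs) are assumptions made for illustration; the sketch does not reproduce the likelihood-based or Bayesian approaches also used in the study, nor any of its data.

```python
def count_mismatches(genotype_a, genotype_b):
    """Count loci at which two diploid genotypes share no allele."""
    mismatches = 0
    for locus, alleles_a in genotype_a.items():
        alleles_b = genotype_b.get(locus)
        if alleles_b is None:
            continue  # locus not scored in both genotypes: skip it
        if not set(alleles_a) & set(alleles_b):
            mismatches += 1
    return mismatches


def is_putative_parent_offspring(genotype_a, genotype_b, max_mismatches=0):
    """Exclusion test: compatible if mismatching loci do not exceed the limit."""
    return count_mismatches(genotype_a, genotype_b) <= max_mismatches


# Toy microsatellite genotypes (hypothetical loci and allele sizes).
variety_x = {"locus1": (150, 154), "locus2": (201, 201), "locus3": (98, 110)}
variety_y = {"locus1": (154, 160), "locus2": (201, 207), "locus3": (110, 112)}
variety_z = {"locus1": (148, 160), "locus2": (205, 207), "locus3": (98, 112)}

compatible_xy = is_putative_parent_offspring(variety_x, variety_y)
compatible_xz = is_putative_parent_offspring(variety_x, variety_z)
```

A strict limit of zero mismatches is the simplest choice. Allowing one mismatching locus is a common way to tolerate genotyping errors or mutations, at the cost of more false positives, which is why exclusion is usually combined with likelihood-based scores.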

French Olive Agroecosystems as Varietal Diversification Incubators
Surprisingly, despite the limited French olive growing area (southern continental France and Corsica), we identified a high number of varieties in the ex situ collection of Porquerolles, including a panel of at least 30 currently cultivated varieties (Moutier et al., 2011). French olive growing is still mainly founded on a traditional system involving a diverse range of crops and varieties (Pinatel, 2015). This could be viewed as a key factor favoring varietal diversity, as previously noted by several authors (Gemas et al., 2004; Khadari et al., 2008; Kaya et al., 2013; Marra et al., 2013; Las Casas et al., 2014; Xanthopoulou et al., 2014). Here, we assumed that the French varietal diversity could mainly be explained by active farmer selection, probably due to the impact of relatively frequent climatic accidents on local germplasm. Indeed, French olives are cultivated along the northern rim of the olive growing area, where frost events are frequent; the last major one, in 1956, caused substantial damage in olive orchards, thus negatively impacting the socioeconomic sector (Pinatel, 2015). Moreover, we report for the first time that French genetic resources displayed substantial parentage relationships involving both local and foreign varieties (55 and 60, respectively), with more than half of the parent-offspring pairs occurring in local French germplasm (53.89%; Table 1). A similar pattern was observed in the western group (Q1 cluster), where the average number of first-degree relationships (full siblings or parent-offspring) was 16.25, while it was 2.0 in the Q2 (central Mediterranean) and Q3 (eastern Mediterranean) genetic clusters. Similarly, in grapevine, selection via crossing was previously identified by investigating the parentage relationships of 'Chardonnay', 'Gamay', and other wine grapes grown in northeastern France (Bowers et al., 1999).
In their extended parentage analysis of the INRA grape germplasm repository (France; 2,344 unique genotypes), Lacombe et al. (2013) identified the full parentage of 828 cultivars including 447 traditional cultivars, which are likely derived from farmer selection in traditional agroecosystems. These processes have been reported in other perennial fruit cropping systems such as apricot, which is seed-propagated in oasis agroecosystems in the southern Maghreb region (Bourguiba et al., 2012; Bourguiba et al., 2013). Since the second half of the 20th century, so-called "modern" perennial fruit cropping systems have been managed with a single variety using agronomic practices that foster yield improvement. They have gradually replaced traditional agroecosystems, which were based on a higher diversity of crops and varieties. Such diversified agroecosystems may still be found in mountainous areas around the Mediterranean Basin, as described, for instance, by Aumeeruddy-Thomas et al. (2017) in northern Morocco. In southern France, where olive and grapevine are often cultivated in the same locations, the olive varieties identified in the present study were mainly derived from crosses between local and foreign genetic resources, as revealed by the parentage analysis.

CONCLUSION
Our results provide a clear picture regarding the importance of farmer selection in the olive varietal diversification process in traditional French agroecosystems. Indeed, we observed substantial parentage relationships within French olive germplasm, and the proportion of parent-offspring pairs was still high (45.08% out of the 193 putative parent-offspring pairs), even when not considering the Italian 'Frantoio' variety or the Spanish 'Gordal Sevillana' variety. Otherwise, we observed a pattern of parentage relationships from crossing: (i) between French and Spanish varieties within agroecosystems in southwestern France, especially in the Pyrénées Orientales area, and (ii) between French and Italian varieties in southeastern France, particularly in the Alpes Maritimes area. We thus argue in favor of active farmer selection founded mainly on local French varieties, probably due to frequent climatic accidents such as frost. When examining diversification processes at the regional scale in all southern European countries, we consider that French agroecosystems are incubators for olive diversification and serve as a bridge between Italy and Spain (Khadari et al., 2003), thus highlighting the importance of diversification as one of the two key processes in the history of olive domestication (Khadari and El Bakkali, 2018).
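As a quick arithmetic check, the two proportions of parent-offspring pairs quoted in this article (53.89% and 45.08%) can be converted to pair counts. The sketch assumes that both percentages share the denominator of 193 putative pairs, which the text states explicitly only for the second value; the variable names are illustrative and the resulting counts are derived here, not reported by the authors.

```python
TOTAL_PAIRS = 193  # putative parent-offspring pairs stated in the text


def pairs_from_percent(pct, total=TOTAL_PAIRS):
    """Convert a reported percentage back to a whole number of pairs."""
    return round(pct / 100.0 * total)


def percent_from_pairs(count, total=TOTAL_PAIRS):
    """Percentage rounded to two decimals, as reported in the article."""
    return round(100.0 * count / total, 2)


# Assumption: both percentages are expressed over the same 193 pairs.
pairs_higher = pairs_from_percent(53.89)
pairs_lower = pairs_from_percent(45.08)

# Round-trip check: the derived counts give back the reported percentages.
roundtrip_ok = (
    percent_from_pairs(pairs_higher) == 53.89
    and percent_from_pairs(pairs_lower) == 45.08
)
```

That both values round-trip exactly to whole numbers of pairs is consistent with the shared-denominator assumption, although it does not prove it.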

DATA AVAILABILITY STATEMENT
All datasets for this study are included in the article/Supplementary Material.

AUTHOR CONTRIBUTIONS
BK designed the research and wrote the manuscript with AEB and GB. AEB, LE, and CT performed microsatellite genotyping. BK and CP checked the reference list of French varieties. BK and AEB performed the data analysis. BK, AEB, and GB interpreted the data analysis. All co-authors participated in approving the final manuscript.

ACKNOWLEDGMENTS
We thank Sylvia Lochon-Menseau, Bruno Bernazeau, Daniel Bielmann (CBNMed collection Porquerolles), Hayat Zaher, Lhassane Sikaoui, and Abdelmajid Moukhli (WOGB of Marrakech) for their management of ex situ collections; Ronan Rivalon, Pierre Mournet, and Sylvain Santoni for laboratory assistance; Franck Curk for the olive fruit illustration in the figures and for his comments on the final draft of the manuscript.

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.01593/full#supplementary-material
TABLE S1 | List of accessions from the WOGB and FOGB collections (529 accessions); 416 and 113, respectively. Accessions are classified according to origin, maternal lineage, inferred ancestry (Q) among clusters at K = 3, and SSR code (code for our genotyping analyses). French cultivars showing relationships with other cultivars are indicated.
TABLE S5 | List of French genotypes showing putative parent relationships with other genotypes using three approaches: the exclusion (mismatch marker), the log-likelihood ratio (LODp), and the exclusion-Bayes' theorem. Maternal lineage and inferred ancestry (Q) among clusters at K = 3 are indicated for each genotype, as well as results for each parentage approach.
TABLE S6 | Number and proportion of genotypes from different countries assigned to each of the three gene pools identified under the assignment probability of Q ≥ 0.8.
(CC22, CC43, and CC75). (x) Corresponds to the presence of the accession in each core collection. Twenty-two varieties (CC22) were sampled by the CoreHunter program when optimizing the Shannon and Weaver index ("Sh strategy"). CC22 was then used as a kernel in the Mstrat program to reconstruct the extended core collections: 43 varieties (CC43) allowed us to capture all of the observed alleles, and 75 varieties (CC75) included the remaining reference varieties not sampled in CC43.
FIGURE S1 | Optimal number of clusters using the program and inferred population structure from K = 2 to K = 4 for 395 distinct genotypes from both collections. H′ represents the similarity coefficient between runs for each K, and ΔK represents the ad hoc measure of Evanno et al. (2005).