Recent and Historical Gene Flow in Cultivars, Landraces, and a Wild Taxon of Cucurbita pepo in Mexico

Gene flow among crops and their wild relatives is an active study area in evolutionary biology and horticulture, because genetic exchange between them may impact their evolutionary trajectories and increase the genetic variation of the cultivated lineages. Mexico is a center of diversity for the genus Cucurbita that includes pumpkins, squash and gourds. Gene flow between domesticated and wild species has been reported as common in Cucurbita; but gene flow among populations of C. pepo ssp. pepo from Mexico and its wild relative has not been studied. We used 2,061 SNPs, derived from tunable genotyping by sequencing (tGBS) to estimate gene flow among 14 Mexican traditional landraces of C. pepo ssp. pepo, also including individuals from five improved cultivars of C. pepo ssp. pepo and C. pepo ssp. ovifera var. ovifera, and individuals of their wild relative C. pepo ssp. fraterna. We found moderate to high levels of genetic diversity, and low to moderate genetic differentiation. In the test of introgression between lineages, we found that all possible arrangements for ancestral and derived sites between the lineages showed similar frequencies; thus, incomplete lineage sorting, but also gene flow, might be taking place in C. pepo. Overall, our results suggest that gene flow between these subspecies and cultigens, incomplete lineage sorting and the retention of ancestral characters shaped the evolutionary trajectory of C. pepo in its area of origin and diversification. In addition, we found evidence of the use of Mexican landraces as genetic material for the improvement of commercial cultivars. The landraces of Mexico are an important source of genetic diversity for C. pepo, which has been preserved both by management practices of small farmers and by the natural gene flow that exists between the different crop fields of the region.


INTRODUCTION
Gene flow has an important role in the evolution of populations, as for instance even a small amount of gene flow is able to counteract the effects of genetic drift and selection (Ellstrand et al., 1999), and may also introduce useful genetic variation and increase adaptation (Aguirre-Planter, 2007). The amount of gene flow in wild plants, including crop wild relatives (CWRs), can strongly vary among species, and in a given species, among populations and seasons (Ellstrand, 1992), playing an important role in defining population structure. The levels of gene flow between domesticated plants and CWRs are also expected to be highly variable depending on space-time factors, including sexual compatibility, geographic proximity, phenology and selection regimes (Ellstrand et al., 1999;Papa, 2005).
For some domesticated species and CWRs, including landraces of beans and cultivated maize in Mesoamerica, gene flow is usually greater from crops to the wild populations (Papa and Gepts, 2004;Martínez-Castillo et al., 2007;Moreno-Letelier et al., 2020). This is probably due to the higher population sizes of domesticated populations compared to wild populations, as well as different regimes of selection in cultivated and wild environments (Papa and Gepts, 2004). In addition, as alleles of the genes associated with domestication traits are usually recessive, first generation hybrids will show a phenotype similar to that of the wild parent, so they will be much better adapted to nature than to the cultivated environment. For this reason, selection in the cultivated environment will act mainly through the first hybrid generation, while in the wild it will act mainly in the following generations, allowing recombination and thus the introgression of alleles from the domesticated population (Papa, 2005).
Wild populations are a potential source of genetic diversity for crops (Hufford et al., 2013;Sawler et al., 2013). Reports from the early eighteenth-century describe the use of wild species to transfer, through gene flow, resistance to pests and diseases to crops (Prescott-Allen and Prescott-Allen, 1986). Modern plant breeders continue to exploit the genetic diversity of CWRs to detect genes and traits for crop improvement (Egan et al., 2018).
Pumpkins and squash (Cucurbita spp.) are among the most important horticultural crops worldwide, providing food products (Lira, 1995;Ferriol and Picó, 2008;Kates, 2019;Wehner et al., 2020). Mexico is the center of origin and center of diversity of some species of Cucurbita, as well as a diversification area for pumpkins CWRs (Lira et al., 2016). Most species of wild Cucurbita distribute in Mexico and coexist in sympatry with their domesticated counterparts (Lira et al., 2016). One of the characteristics of the genus Cucurbita is that there is the possibility of reproduction between domesticated plants and their CWR (Whitaker and Bemis, 1965;Decker-Walters and Walters, 1988;Wilson, 1990). Nevertheless, in Mexico few studies have focused on analyzing gene flow between CWRs (i.e., C. argyrosperma ssp. sororia and C. pepo ssp. fraterna) and domesticated pumpkins (Wilson et al., 1994;Montes-Hernández and Eguiarte, 2002;Sánchez-de la Vega et al., 2018). These authors found evidence supporting gene flow at a local level (Wilson et al., 1994;Montes-Hernández and Eguiarte, 2002), but low levels of gene flow at a regional scale (Sánchez-de la . However, the evaluation of gene flow in pumpkins at the genomic level for C. pepo ssp. pepo in Mexico has not been explored. Within cultivated pumpkins, Cucurbita pepo is the most widely cultivated and economically important of the genus Cucurbita (Paris, 2016). Cucurbita pepo consists of five taxa, three CWRs: C. pepo ssp. fraterna, C. pepo ssp. ovifera var. texana, and C. pepo ssp. ovifera var. ozarkana; and two cultivated: C. pepo ssp. pepo and C. pepo ssp. ovifera var. ovifera. Two independent domestication events have been documented for C. pepo (Decker, 1988;Wilson et al., 1992;Decker-Walters et al., 1993;Sanjur et al., 2002;Ferriol et al., 2003;Paris et al., 2003;Gong et al., 2012;Castellanos-Morales et al., 2018). One is presumed to have occurred 10,000 years ago in Mexico, and corresponds to C. pepo ssp. pepo (Smith, 1997;Sanjur et al., 2002) possibly from C. pepo ssp. fraterna (Andres, 1987;Castellanos-Morales et al., 2019). However, there is also genetic evidence indicating that C. pepo ssp. fraterna is more closely related to C. pepo ssp. ovifera than to C. pepo ssp. pepo (Decker-Walters et al., 1993;Gong et al., 2012;Kates et al., 2017). The second domestication event could have taken place 5,000 years ago in Southeast United States for C. pepo ssp. ovifera var. ovifera, likely from C. pepo ssp. ovifera var. ozarkana (Decker-Walters et al., 1993;Sanjur et al., 2002;Smith, 2006).
Gene flow between different crops and sympatric CWRs for C. pepo has been previously studied (Decker and Wilson, 1987;Kirkpatrick and Wilson, 1988;Spencer and Snow, 2001;Decker-Walters et al., 2002), but not yet in Mexico, where traditional and local landraces of C. pepo ssp. pepoincluding güiches, güicoy and other pumpkins-are found . These landraces have evolved in diverse environments, resulting in remarkable variation among them in traditional farming communities from Mexico to Central America (Lira, 1995;Castellanos-Morales et al., 2019).
Cultivation of Cucurbita pepo ssp. pepo in Mexico is carried out in both, the traditional (milpa) system-where landrace pumpkins are grown-and the intensive agricultural system, where modern cultivars, zucchini and vegetable marrow are grown (Mera Ovando et al., 2011). Therefore, in this study we estimated the ancestral and recent gene flow between the different subspecies and cultigens of both wild and domesticated C. pepo, and we discuss the consequences that this may represent for wild populations and for Mexican landraces. Moreover, the presence of commercial cultivars in the vicinity of traditional Mexican landraces and wild populations is a relevant factor for gene flow, and the direction of gene flow between taxa may vary according to the effective size of the populations (Ellstrand and Elam, 1993). In home gardens, where C. pepo Mexican landraces are grown, they usually have few individuals, so gene flow between crops and CWRs is expected to be low (Ellstrand et al., 1999), while in a horticultural crop managed with industrial farming practices, which can contain thousands of plants, the potential for such gene flow is expected to be higher (Ellstrand et al., 1999).

Sampling
We initially analyzed a total of 95 plants of different subspecies and cultigens of C. pepo: C. pepo ssp. pepo from landraces collected in Mexico (n = 70), C. pepo ssp. fraterna (n = 8), C. pepo ssp. ovifera var. ovifera (n = 5) and commercial cultivars (n = 12) ( Table 1). For the Mexican landraces of C. pepo spp. pepo, we determined 13 sampling points from the distribution documented by CONABIO project KE004 (database available at http://www.snib.mx/iptconabio/resource? r=SNIB-KE004), covering most of its distribution range in Mexico (Figure 1). The samples come from collections carried out during the years 2014-2015 directly from markets and farmlands in different areas of the country, as well as accessions donated by the Germplasm Bank of the Bajío Experimental Field of the INIFAP located in Celaya, Guanajuato (Mexico), including C. pepo ssp. fraterna sampled in 2011 that were collected and identified by Dr. Salvador Montes-Hernández (Table 1).

Seed Germination and DNA Extraction
In order to have a good representation of the genetic diversity found in C. pepo, 15-20 seeds per locality and accession were sown. In localities where there was more than one fruit, seeds were taken from different fruits. For the samples from the Germplasm Bank, we selected those samples that came from the fruits originally collected and that have not been bred, to minimize both the risk of genetic contamination and the loss of genetic diversity. Each plant was grown in a pot with a mixture of fertile soil under controlled temperature conditions (at an average 35 • C). The plants were kept in a greenhouse for 30 days, at which time a leaf tissue sample of approximately 2 cm 2 was collected, which was used to carry out DNA extraction.
DNA extraction was performed with the DNeasy Plant Mini Kit (Qiagen), following the supplier's instructions. We ran 1% agarose gels to evaluate the integrity of the DNA. We used nanodrop (Thermo Fisher Scientific) to determine the purity of the extractions and Qubit BR dsDNA reactions (Thermo Fisher Scientific) to estimate the DNA concentration per sample. DNA extractions that had purity values 1.8-2.0 for A260/A280 and DNA concentrations >20 ng/ml were sent to DATA2BIO 1 for library preparation for tunable genotyping-by-sequencing (tGBS; Ott et al., 2017). Sequencing was performed with Ion Torrent platform also by DATA2BIO (see text footnote 1), resulting in a total of 124,783,053 reads (Available in the National Center of Biotechnology Information, under the BioProject accession PRJNA485527).

SNP Calling
We checked the PHRED quality of the sequence reads with FastQC V. 0.11.1 (Andrews, 2010). Subsequently, the data were cleaned with fastxtools V 0.0.14 (Gordon and Hannon, 2010), retaining the sequences with PHRED values > 15 in 80% of their bases. The remaining sequences were mapped using Segemehl V. 0.3.4, which is a recommended program for data sequenced 1 www.data2bio.com with Ion Torrent (Caboche et al., 2014). We used the available genome assembly of the C. pepo ssp. pepo Zucchini MU-CU-16 (Montero-Pau et al., 2018) as the reference for our read mapping. Reads that had unique mapping to the reference genome were kept. We used Bcftools V. 1.9 for SNP calling, using a mapping quality and PHRED values of at least 20 for each called SNP. Subsequently, we used VCFtools V. 0.1.1 (Danecek et al., 2011) to retain the sites with a minimum sequencing depth of 5X per site per individual. The resulting SNPs were filtered with Plink V. 1.9 FIGURE 1 | Distribution map of the localities where the different samples of Cucurbita pepo ssp. pepo (purple for landraces, and blue for improved, i.e., commercial cultivars), C. pepo ssp. fraterna (wild, yellow), and C. pepo ssp. ovifera var. ovifera (ornamental, black) were collected. (Purcell et al., 2007), removing sites and individuals with more than 75% missing data, and with a minimum allele frequency (MAF) <0.01. SNPs that did not pass the exact Hardy-Weinberg equilibrium test were eliminated with a cutoff value p < 0.00001, as well as SNPs with linkage disequilibrium greater than 0.3 that were within 1,000,000 bp and sliding windows with 100 bp jumps. We retained a total of 2,061 SNPs, and a total of 88 individuals of the different subspecies and cultigens of C. pepo, which are grouped as: (1) Mexican landraces, which refer to C. pepo ssp. pepo pumpkins from landraces that were collected in Mexico (n = 64); (2) Wild group, which consists of samples from C. pepo ssp. fraterna (n = 8); (3) Ornamental group which is formed by samples from C. pepo ssp. ovifera var. ovifera (n = 5); (4) Improved group, which consists of several commercial cultivars described in Table 1 (n = 11).

Genetic Diversity
We estimated the basic measures of genetic diversity [private alleles (Pa), observed heterozygosity (H O ), expected heterozygosity (H E ), and inbreeding coefficient (F IS )] with the Stacks V. 2.54 population module (Catchen et al., 2013), for the subspecies and cultigens of C. pepo, as well as for each commercial cultivar and each locality where the landraces of C. pepo were collected.
As sample size for each locality was different, we performed a difference test for H E with the R package adegenet V. 2.1.3 (Jombart, 2008;Jombart and Ahmed, 2011), with 10,000 simulations and a two-sided alternate hypothesis, to test if genetic diversity showed statistical differences among localities.

Population Structure and Historical Gene Flow
To infer the ancestral gene flow for C. pepo, three different tests were carried out. The first test was a principal component analysis (PCA), calculated with Plink V. 1.9 (Purcell et al., 2007), and the data was visualized with the R package tidyverse V.
1.3.0 (Wickham et al., 2019). The second test was an analysis of individual ancestry by maximum likelihood. For this we used the Admixture V. 1.3 program (Alexander et al., 2009), where we tested K-values from 2 to 12. We performed a cross-validation test to determine the best K-value. For the analysis, a single run was performed with the values predetermined by the program, that is, a bootstrap of 200, with random seed number selection, with a criterion of stopping each run when the probability of recording each run increases by less than ε = 10 −4 between iterations. The third test was a pairwise F ST analysis, which is an indirect method for estimating historical and recent gene flow (Slatkin, 1987). We used the "exact test of sample differentiation based on genotype frequencies" implemented in Arlequin V. 3.5.2.2 (Excoffier and Lischer, 2010). This estimate is based on the hypothesis of random distribution of individuals between pairs of populations as described in Raymond and Rousset (1995) and Goudet et al. (1996). We compared the differences between wild, ornamental, commercial and Mexican landraces using 100,000 MCMC iterations and a duration of 10,000 dememorization steps, with a P cutoff of 0.05.
To detect signs of introgression between the studied lineages, we obtained a second dataset. We reanalyzed the samples for 95 individuals of C. pepo and included 10 individuals of C. argyrosperma ssp. sororia as an outgroup. We repeated the quality filters, mapping, and SNP calling with the same parameters used before (see "SNP Calling"). We retained a total of 1,721 SNPs, and a total 91 individuals, the same 88 from C. pepo as before plus three samples from C. argyrosperma ssp. sororia  Table 1).
To assess evidence of introgression between lineages or gene flow between C. pepo ssp. fraterna and cultigens, we conducted an ABBA-BABA test with the Dsuite V 0.4 program (Malinsky et al., 2021). We calculated the D-statistic along with the admixture fraction f with the above dataset (Durand et al., 2011). This method analyzes the correlation between allele frequencies across populations to conduct a formal test for the history of admixture between populations, and it has been proved robust under most demographic scenarios (Malinsky et al., 2021). The test uses a four-taxon tree {Outgroup [P1, (P2, P3)]}, including an outgroup (O) that defines the ancestral and derived alleles (defined as "A" and "B, " respectively) in biallelic SNPs. The site patterns are ordered, so a "BBAA" pattern indicates that lineages P1 and P2 share the derived allele (B), while an "ABBA" pattern suggests that P2 and P3 share the derived allele, and a "BABA" pattern reflects that P1 and P3 lineages share the derived allele (Malinsky et al., 2021). To assess which pattern is most likely, the D-statistic estimates the number of times each of these patterns ("BBAA, " "ABBA" and "BABA") conflict with the tree. If these patterns occur in equal frequencies, the statistic has a value of D = 0,-we accept the null hypothesis of no introgression-indicating that incomplete lineage sorting is the prevailing process. Conversely, if introgression is occurring between lineages there will be an excess of either "ABBA" or "BABA" patterns, resulting in D > 0 if the introgression is between P2 and P3 or D < 0 if introgression occurred between P1 and P3 (Durand et al., 2011;Malinsky et al., 2021). We analyzed all possible arrangements between lineages (Mexican landraces, wild, improved and ornamental), using C. argyrosperma ssp. sororia as outgroup, and the D-statistic significance was calculated by jackknifing blocks of 20 bp.

Recent Gene Flow
Current migration among C. pepo accessions was evaluated with BayesAss V. 3.0.4 (BA3) (Mussmann et al., 2019). This Bayesian method is based on Markov chain Monte Carlo (MCMC) to estimate the posterior probabilities of recent migration rates, i.e., in the last two generations (Wilson and Rannala, 2003). Parameter values were estimated after 10 million burn-in steps to allow stabilizing or convergence out of 100 million iterations sampling every 10,000 iterations. To compute suitable allele frequency (a), inbreeding coefficient (f) and migration rate (m), we tested several values until we obtained the acceptance percentages recommended by BayesAss authors (between 20 and 60%). We examined the convergence for each run using Tracer v 1.6 and we determined the final values as: a = 0.30 (62% accepted), f = 0.08 (37% accepted), and m = 0.20 (48% accepted).

Genetic Diversity
We analyzed a total of 2,061 SNPs for 88 individuals.  Table 2). Nevertheless, we did not find significant differences in H E values between taxonomic groups (Supplementary Table 3), and among populations (Supplementary Table 4), probably due to the small sample size, so these data must be taken with care.
The landraces showed the highest inbreeding coefficients (F IS ), with an average F IS = 0.116 and the highest number of private alleles (Pa = 263), while the lowest inbreeding coefficients were found in an ornamental mix (F IS = 0.037) and the wild subspecies (C. pepo ssp. fraterna; F IS = 0.055) ( Table 3). However, as in H E , the inbreeding coefficient shows highly variable ranges that go from F IS = -0.022 in Tlaxcala, to F IS = 0.126 in Michoacán (Supplementary Table 2).

Population Structure and Historical Gene Flow
In the principal component analysis (PCA), we observed three clusters (Figure 2). The first cluster is found at the center of the quadrants and contains almost all wild individuals (C. pepo ssp. fraterna). The second cluster is located in the upper right quadrant and is formed by only individuals belonging to landraces from Mexico. The last cluster is located in the lower right quadrant and consists of individuals of the four different groups of C. pepo, although individuals are more widely dispersed, especially landraces.
Pairwise F ST values among the three subspecies of C. pepo and their improved cultivars were not statistically significant ( Table 3). This suggests low genetic differentiation among C. pepo crops and CWRs. The lowest values of genetic differentiation were found among Mexican landraces and the wild subspecies (F ST = 0.0064), as well as among the Mexican landraces and the ornamental mix (F ST = 0.0067) ( Table 3).
According to the Admixture analysis, the most likely values of K were 2 (CV = 0.386) and 4 (CV = 0.387) (Figure 3). For K = 2, the most common cluster (Figure 3A) is found in 82.7% of the samples, and is represented by purple. For K = 4 (Figure 3B), the most common cluster is found in 64.8% of the samples (purple in Figure 3B); followed by the black cluster and the yellow cluster with 15.59 and 12.06% of the samples, respectively. The cluster with the least presence is the blue one, with only 7.53% of the   samples assigned to this genetic group. For K = 4, we see that the Mexican landraces are mainly assigned to the purple cluster, as well as individuals of the wild subspecies (i.e., CWR). Most of the individuals from commercial cultivars analyzed show a combination of genetic clusters, and there is only one individual of the ornamental mix with a small component of the yellow cluster. The blue cluster is represented mainly by the improved or commercial cultivars and the Mexican landrace accession of San Luis Potosí, as well as by a very low proportion in the rest of C. pepo ( Figure 3B). For the ABBA-BABA test, the counts of the sites "AABB, " "ABBA" and "BABA" are similar for all possible arrangements between the lineages, causing the D-statistics values to be low. In other words, the analysis did not show evidence of hybridization among the Mexican landraces and any other lineage (Table 4). Nevertheless, it is worth noticing that two of the tests were significant (P < 0.05) with values of D > 0, with higher counts for "AABB" sites followed by "ABBA" sites ( Table 4).

Recent Gene Flow
BayesAss analyses (Figure 4) suggests that migration occurs mainly from the Mexican landraces of C. pepo ssp. pepo to the rest of the analyzed groups, and the wild subspecies C. pepo ssp. fraterna appears as the main receiver. In contrast, C. pepo ssp.
fraterna was the source population with the lowest fraction of migrants (ranging from 0.5 to 3.75%).

Genetic Diversity
In general, domesticated taxa are expected to show lower genetic diversity, when compared to CWRs, because domestication generally represents a bottleneck where domesticates are usually descendants of a few wild individuals (Doebley, 1989;Gepts and Papa, 2003;Gross and Olsen, 2010;Meyer and Purugganan, 2013;Gepts, 2014;Gaut et al., 2018). Contrary to this pattern, domesticated C. pepo ssp. pepo from Mexico (H E = 0.196) showed slightly higher genetic diversity than its wild relative C. pepo ssp. fraterna (H E = 0.185) ( Table 2), but these differences were not significant (Supplementary Table 4). This pattern has been previously reported for nuclear microsatellite loci and chloroplast sequences in C. pepo The genetic diversity in C. pepo is higher than that reported for C. argyrosperma, also estimated with SNPs (Barrera-Redondo et al., 2021). The latter may relate to C. pepo being the pumpkin species with higher commercial importance and higher morphological diversity (Paris, 2001). Accordingly, we expect cultivated C. pepo to show high effective population sizes, and high dispersal through anthropogenic movements, which is further supported by a lack of genetic structure, in contrast to C. argyrosperma and C. moschata in Mexico (Sánchez-de la Castellanos-Morales et al., 2019;Hernández-Rosales et al., 2020;Barrera-Redondo et al., 2021). In our analyses, the ornamental mix (C. pepo ssp. ovifera var. ovifera) shows the lowest value of genetic diversity (H E = 0.176). These cultivars have been subjected to intense agricultural management, which causes strong bottlenecks and very intense selective pressures (Formisano et al., 2010). However our results must be interpreted with caution, given that our sample size is small, and the analysis includes only a few representatives of this subspecies and none of its wild populations. We recommend in the future to analyze the wild populations of C. pepo ssp. ovifera var. texana and C. pepo ssp. ovifera var. ozarkana with large sample sizes and genomewide data to compare the levels of genetic diversity between all C. pepo CWRs.
The admixture and PCA analyses indicate that the San Luis Potosi and "Vegetable Spaghetti" samples show genetic similarity (Figures 2, 3). This, together with the low genetic diversity found in San Luis Potosi, may suggest that the analyzed individuals from San Luis Potosi share ancestry with "Vegetable Spaghetti" commercial cultivar (blue genetic cluster in Figure 3B). In  The numbers in "AABB" "ABBA" "BABA" indicate the number of times each pattern for the shared derived allele occurred in the analysis, standardized by the allele frequencies.
certain parts of Mexico, the traditional cultivation of milpa is still practiced, which helps maintain the genetic diversity of populations (Mastretta-Yanes et al., 2018). However, in other regions of the country -such as San Luis Potosi, which is part of the Bajio region, an important agricultural area where many crops are grown for commercial purposes-there has been a replacement by modern agricultural systems, where seeds of improved cultivars are used (Mera Ovando et al., 2011;Sánchez-de la Vega, 2017). It has been suggested that the genetic diversity stored in landraces allows crops to adapt to harsher environments and makes cultivars less susceptible to pests and diseases (Mastretta-Yanes et al., 2018). Traditional agricultural systems should be preserved to maintain the genetic diversity of Cucurbita crops (Montes-Hernández et al., 2005), and future analyses that explicitly test the effects of different management practices -milpa in contrast to intensive agriculture, on the genetic diversity of pumpkins need to be conducted. We also found high inbreeding values, especially in Mexican landraces, which can be related to the use of less than 10 fruits from each harvest for next year's harvest, and also to the fact that the exchange of seeds between different farmers is uncommon (but see Montes-Hernández and Eguiarte, 2002;Montes-Hernández et al., 2005). Inbreeding can help establish favorable phenotypic characters in domesticated species, and in some cases farmers safeguard, promote, select and control the reproductive processes of a few individuals (Casas and Parra, 2016). A rapid fixation of the desired characters through artificial selection in the domestication and improvement processes can be acquired by a certain degree of inbreeding (Ellstrand and Elam, 1993). However, we cannot rule out the presence of Wahlund effect in our estimates of inbreeding, since our samples come from many fields across Mexico.

Genetic Structure and Historical Gene Flow
The genetic differentiation among subspecies and cultigens of C. pepo estimated with SNPs was very low and not significant, except between Mexican landraces and improved cultivars (F ST = 0.0203) ( Table 3). This result may relate to (1) possible historical or recent gene flow, (2) incomplete lineage sorting, or (3) retention of ancestral characters.
The facility for spontaneous hybridization in C. pepo has been well documented since the 18th century (Paris, 2000, Paris, 2001. In particular, gene flow between wild (C. pepo ssp. fraterna) and C. pepo cultigens was not expected, because of their current allopatric distributions. Nevertheless, species distributions may change through time, as a response to climate and other environmental changes. In this sense, species distribution models suggest that C. pepo ssp. fraterna could have had a wider range during the mid-Holocene (Kistler et al., 2015;Castellanos-Morales et al., 2019), reaching areas near the Tehuacán and Ocampo Caves, where archeological remains have been found (Whitaker et al., 1957;Hanselka and King, 2017). Therefore, past gene flow between C. pepo ssp. fraterna and ancestral Mexican landraces cannot be ruled out.
Incomplete lineage sorting in C. pepo may also result in low levels of genetic differentiation and has been cited as an explanation for the differences in the relationship between taxa found in phylogenetic analyses based on different molecular markers and methods of analyses (Sanjur et al., 2002;Kistler et al., 2015;Kates et al., 2017;Castellanos-Morales et al., 2018. Accordingly, previous genetic studies, and linguistic evidence (Brown et al., 2012), have suggested a rapid human mediated dissemination after domestication or geographically widespread evolution of domesticated forms, as well increases and decreases in the effective population size of C. pepo ssp. fraterna throughout its history, and the increase in the effective population size of Mexican local varieties of C. pepo ssp. pepo (Zizumbo and Colunga, 2008;Kistler et al., 2015;Castellanos-Morales et al., 2019). All of the above may have favored incomplete lineage sorting.
Retention of ancestral alleles is an equally likely explanation for low levels of genetic differentiation between C. pepo groups found in this study. Phylogenetic analyses, dates of divergence estimated based on molecular clock and the archeological records suggest that the divergence of these taxa is very recent (Whitaker et al., 1957;Smith, 1997Smith, , 2006Kistler et al., 2015;Hanselka and King, 2017;Castellanos-Morales et al., 2018. Also, Xanthopoulou et al. (2019) reported moderate linkage disequilibrium in C. pepo, and a high proportion of shared SNPs between subspecies. All of the above suggest that not enough time has passed for the differential fixation of alleles in neutral nuclear markers in these taxa. Moreover, pumpkins of Mexico and Guatemala are morphologically the most similar to wild gourds, reflecting a possible earlier stage of domestication, which is further observed in the persistence of the round shape of the fruit (Paris, 2001). Deeper genomic analysis will be necessary to analyze the retention of ancestral alleles in this species.
The admixture analysis ( Figure 3B) showed that the improved cultivars have the highest assignment probability to the blue clusters, as well as the presence of crossed individuals with the other genotypes. The latter may occur because different cultigens of C. pepo have been crossed to breed improved cultivars (Ferriol and Picó, 2008), as well as for the introduction of genes from other pumpkin species for genetic improvement (Whitaker, 1962;Whitaker and Davis, 1962;Paris et al., 1988;Holdsworth et al., 2016;Castellanos-Morales et al., 2019), especially to produce disease-resistant crops (Robinson and Shail, 1987;Cohen et al., 2003;Formisano et al., 2010;Paris, 2018).
We did not find well-defined groups based on morphology. These results contrast with previous analyses in C. pepo, which found a correlation between genetic clusters and C. pepo morphotypes Gong et al., 2012). These studies focused mainly on commercial cultivars, which have smaller effective population sizes, and are subject to strong artificial selection. As mentioned above, in Mexico the most common morphotype is the Pumpkin type, with morphological characteristics more similar to the ancient groups: large and elongated seeds, lignified shell, round shape, and marked keels Paris, 2001). In fact, previous studies with maize landraces from Mexico have reported that the morphological structure of landraces has no relationship with the genetic structure (Arteaga et al., 2016;Moreno-Letelier et al., 2020), and people select phenotypes that they recognize as their local crop of interest, regardless of the genotype that these crops may have (Louette et al., 1997). In consequence, the lack of genetic differentiation within Mexican landraces of C. pepo might relate to cultural preference toward our local landraces. As a matter of fact, in Mexican markets or supermarkets it is common to find mature round pumpkins (in autumn) or immature zucchini (throughout the year); cultivar such as "Vegetable Spaghetti" are considered a rarity and grown mostly for export.
In the ABBA-BABA analysis, all possible arrangements between the lineages were found in a similar number of times (Table 4), causing the values of the D-statistics to be close to zero. This means that there are no strong signs of introgression between lineages in C. pepo, and incomplete lineage sorting may explain the patterns of genetic structure we have found in Mexico. Nevertheless, two of the values were significant with D > 0, suggesting possible introgression between improved and ornamental mix and between wild and ornamental mix ( Table 4). These results further suggest that gene flow and incomplete lineage sorting may explain the distribution of genetic diversity in C. pepo.
For the "pipiana" squash (C. argyrosperma ssp. argyrosperma) and its wild relative (C. argyrosperma ssp. sororia), Barrera-Redondo et al. (2021) found that many derived sites are shared between taxa-that is, they found an "AABB" pattern -. The latter means that lineages in C. argyrosperma are well differentiated (monophyletic). Also, for C. argyrosperma there was genomic evidence of extant gene flow after initial domestication (Barrera-Redondo et al., 2021). The observed differences between C. pepo and C. argyrosperma may relate to differences in their domestication history-reticulate domestication followed by rapid expansion in C. pepo vs. single domestication followed by gene flow in C. argyrosperma (Andres, 1987;Castellanos-Morales et al., 2019;Barrera-Redondo et al., 2021) -. Also, trade and management patterns differ between these species, as C. argyrosperma ssp. argyrosperma is traded mainly at a local level and there are well defined genetic clusters in this taxon (Sánchez-de la Barrera-Redondo et al., 2021), while trade for C. pepo ssp. pepo occurs at a larger scale (Lira, 1995;Zizumbo and Colunga, 2008;Mera Ovando et al., 2011). Larger effective population size in C. pepo and gene flow between different geographic regions, may be causing the patterns of derived sites vs. ancestral sites to blur in this taxon. However, as previously mentioned, deeper genomic analyses that include a wider sample of the wild subspecies should be conducted in C. pepo.

Recent Gene Flow
The BayesAss analysis suggests that gene flow is more common from landraces to CWRs and to commercial cultivars of pumpkins (Figure 4). As previously mentioned, these signals of high levels of gene flow could relate to the confusing effect of gene flow, incomplete lineage sorting, and retention of ancestral characters.
Our study focuses on pumpkin landraces from Mexico, which may be found in sympatry with other wild Cucurbita species in different areas of its distribution. Cucurbita flowers are pollinated by bees, which can spread the different alleles between nearby cultivated plots, maintaining or increasing the genetic diversity of the populations. In fact, it has been documented that Peponapis and Xenoglossa bees (specialized pumpkin pollinators), can promote the movement of pollen at distances of at least 700 m (Kohn and Casper, 1992), while the common bees (Apis mellifera), as well as the small Lepidoptera and Syrphidae, promote gene flow at distances up to 1,000 m (Ellstrand and Marshall, 1985;Ellstrand et al., 1999). Furthermore, there is a report in the United States that gene flow between C. pepo ssp. pepo and C. pepo ssp. ovifera var. texana pollinated by Xenoglossa angustior and Peponapis pruinosa can reach distances up to 1,300 m (Kirkpatrick and Wilson, 1988). However, although gene flow can help maintain their genetic diversity, each generation of pumpkin is still subject to inbreeding and genetic drift associated with the selection by farmers of a few parental individuals for production the following year (Montes-Hernández et al., 2005).
We estimated significant gene flow from landraces to improved and ornamental cultigens of C. pepo (Figure 4). This may be indicating that landraces of Mexico are still being used as a base or parental group for the generation of improved cultivars, which is consistent with our admixture result (Figure 3). Nevertheless, this may also relate to incomplete lineage sorting, the retention of ancestral characters and a hybrid origin of some improved cultivars (Kates et al., 2017;Castellanos-Morales et al., 2018. Estimated gene flow of the wild subspecies C. pepo ssp. fraterna to landraces and commercial cultivars was low (Figure 4). This is to be expected, because the area where C. pepo ssp. pepo grows in Mexico has little or no overlap with the restricted distribution of its wild relative Kates, 2019). However, we can observe that this pattern is very different in the opposite sense, that is, that C. pepo ssp. fraterna seems to be receiving a lot of gene flow from other crops, mainly landraces, even when they do not grow in sympatry. Nonetheless, C. moschata and C. argyrosperma ssp. argyrosperma grow in sympatry with C. pepo ssp. fraterna in the locality of Vado el Moro, from where our samples were collected, and as previously reported by Wilson et al. (1994), interspecific gene flow may be taking place. The latter is something relevant to consider, because gene flow can lead to the loss of genetic identity and genetic diversity of the wild species, as well as to the detriment of its populations (Ellstrand et al., 1999;Campbell et al., 2016). Future studies should focus in estimating interspecific gene flow between Cucurbita crops and their wild relatives where they coexist.
In this study, we found that the Mexican landraces appear to be a genetic reservoir for the management and development of the improved and ornamental cultivars, more than the wild populations, which seem to be preserved by both, the management techniques of small farmers and gene flow between different milpas in each area. Therefore, we strongly recommend implementing government programs that promote the conservation of the milpa farming system. We also recommend monitoring introgression from modern cultivars in Mexico, to avoid the loss of identity and genetic diversity of local landraces, as well as important alleles for the adaptation of crops to inhospitable environments.

DATA AVAILABILITY STATEMENT
The SRA records is in the NCBI database. SRA records will be accessible with the following link after the indicated release date: https://www.ncbi.nlm.nih.gov/sra/PRJNA485527.