Assessing Population Genetic Status for Designing Plant Translocations

Assisted gene flow interventions such as plant translocations are valuable complementary techniques to habitat restoration. Bringing new genetic variants can contribute to increasing genetic diversity and evolutionary resilience, counteract inbreeding depression and improve plant fitness through heterosis. Large, highly genetically variable populations are usually recommended as sources for translocation. Unfortunately, many critically endangered species only occur as small populations, which are expected to show low genetic variation, high inbreeding level, paucity of compatible mates in self-incompatible species, and increased genetic divergence. Therefore, assessment of population genetic status is required for an appropriate choice of the source populations. In this paper, we exemplify the different analyses relevant for genetic evaluation of populations combining both molecular (plastid and nuclear) markers and fitness-related quantitative traits. We assessed the genetic status of the adult generation and their seed progeny (the potential translocation founders) of small populations of Campanula glomerata (Campanulaceae), a self-incompatible insect-pollinated herbaceous species critically endangered in Belgium. Only a few small populations remain, so that the species has been part of a restoration project of calcareous grasslands implementing plant translocations. In particular, we estimated genetic diversity, inbreeding levels, genetic structure in adults and their seed progeny, recent bottlenecks, clonal extent in adults, contemporary gene flow, effective population size (Ne), and parentage, sibship and seed progeny fitness variation. Small populations of C. glomerata presented high genetic diversity, and extensive contemporary pollen flow within populations, with multiple parentage among seed progenies, and so could be good seed source candidates for translocations. 
As populations are differentiated from each other, mixing the sources will not only optimize the number of variants and of compatible mates in translocated populations, but also the representativeness of the species' regional genetic diversity. Low genetic diversity is not an immediate threat to population persistence, but small Ne, restricted among-population gene flow, and evidence of processes leading to genetic erosion, inbreeding and inbreeding depression in the seed progeny require management measures to counteract these trends and stochastic vulnerability. Habitat restoration facilitating recruitment, flowering and pollination, reconnecting populations by biological corridors or stepping stones, and creating new populations through translocations in protected areas are particularly recommended.


INTRODUCTION
Assisted gene flow interventions such as plant translocations are valuable complementary techniques to habitat restoration based on ecological management practices (Zimmer et al., 2019;Gargiulo et al., 2021). Plant translocations consist of intentionally introducing individuals (seeds, cuttings or plug plants) in genetically depauperate, inbred or extirpated populations that cannot recover through pollen flow or through recolonization by seeds from soil seed banks or from nearby populations (Menges, 2008;Weeks et al., 2011). Bringing new genetic variants can contribute to increasing genetic diversity and evolutionary resilience, counteract inbreeding depression and improve plant fitness through heterosis (Zavodna et al., 2015;Van Rossum et al., 2020;Van Rossum and Le Pajolec, 2021). However, translocation-based population restoration may fail for several reasons. Poor habitat conditions have been reported as a determinant factor in failure, resulting in high mortality of transplants and low recruit establishment (Godefroid et al., 2011;Schäfer et al., 2020;Volis and Blecher, 2021). Other studies found high inbreeding levels in transplants or in their offspring as a result of a low number of genotypes or reduced pollination service in source populations (Krauss et al., 2002;Monks et al., 2021;Van Rossum and Le Pajolec, 2021). If the introduced gene pool is too genetically differentiated from the natural pool or if the propagated transplants originate from several genetically differentiated sources, founders may be maladapted, and outbreeding depression may be found in the outcrossed progeny as a result of a breakdown of the coadapted genes (Edmands, 2007;Bowles et al., 2015;Barmentlo et al., 2018). 
Therefore, carefully preparing the translocation is essential for optimizing the success in achieving genetically and demographically viable populations (Godefroid et al., 2016a;Commander et al., 2018) and optimizing population resilience to changing environmental conditions (Borrell et al., 2019;Pazzaglia et al., 2021). Several features have to be taken into account in designing the translocation. Species life-history traits, such as longevity, breeding system (e.g., self-incompatibility, dioecy, autogamy), pollen and seed dispersal abilities and clonal ability determine species sensitivity to habitat fragmentation and species response to restoration by influencing mating processes, gene flow and population genetic composition (Charlesworth, 2006;Dudash and Murren, 2008;Berjano et al., 2013;Wiberg et al., 2016;Bittebiere et al., 2020;Tierney et al., 2020;Garcia-Jacas et al., 2021). The selection of sources for translocation material (e.g., seeds, cuttings, tubers) has to be based on ecological characteristics, genetic composition and seed production and quality (Commander et al., 2018;Hoban et al., 2018;Bragg et al., 2020). Planting design, i.e., the number of planted founders, spatial arrangement and management of the sites before and after plant translocation, may also affect demographic dynamics and mating processes (Betz et al., 2013;Albrecht and Long, 2019;Silcock et al., 2019;Van Rossum et al., 2020).
Large, highly genetically variable populations are usually recommended as source (one or several in mixture) for translocation (Basey et al., 2015;Hoban et al., 2018;Schäfer et al., 2020). Unfortunately, many critically endangered species only remain as small, spatially isolated populations. Those populations are expected to show low genetic variation, high inbreeding levels, and for self-incompatible or dioecious species, paucity of compatible mates (e.g., Aguilar et al., 2008;Angeloni et al., 2011;Berjano et al., 2013;Ottewell et al., 2016). As a result, seed production may be reduced and of poor quality (Vergeer et al., 2004;Wiberg et al., 2016;Aguilar et al., 2019;Tierney et al., 2020). Small populations may also consist of remnant, senescing populations, with old, possibly highly clonal, individuals still holding historical (pre-fragmentation) genetic variation (e.g., Van Geert et al., 2015;Gargiulo et al., 2019;Thomas et al., 2021). However, in the current habitat fragmentation context, contemporary gene flow may be restricted due to spatial isolation and pollination disruption (e.g., Ghazoul, 2005;Aguilar et al., 2008). As a result, the seed progeny intended to be used as source for translocation may be genetically depauperate and inbred (Van Geert et al., 2008;Van Rossum, 2008;Thomas et al., 2021;Van Rossum and Le Pajolec, 2021). Small, isolated populations may also become genetically differentiated from each other, as a result of inbreeding and genetic drift (Ottewell et al., 2016;Wiberg et al., 2016), with the risk of outbreeding depression (Edmands, 2007;Barmentlo et al., 2018). Therefore, assessment of the genetic status of the populations is important for an appropriate selection of the source populations.
Criteria for selecting populations using genetic tools are detailed in many guidelines for plant translocations (e.g., Maschinski and Albrecht, 2017;Commander et al., 2018;Pazzaglia et al., 2021), and practical studies investigating the genetic status of populations for designing translocations are increasing (e.g., Berjano et al., 2013;Wiberg et al., 2016;Van Rossum and Raspé, 2018;St. Clair et al., 2020;Garcia-Jacas et al., 2021;Gargiulo et al., 2021;Thomas et al., 2021), but studies combining the use of molecular DNA markers and fitness-related quantitative traits for assessing the genetic status still remain uncommon (Gentili et al., 2018;Borrell et al., 2019). Molecular markers allow for quantifying genetic diversity, inbreeding levels, parentage and sibship, pollen and seed dispersal and therefore gene flow, identifying mating processes, and possible pollination failure (e.g., Hardy et al., 2004;Carroll and Fox, 2008;Llaurens et al., 2008;Jones and Wang, 2010;Bragg et al., 2020;Van Rossum and Hardy, 2021). Effective population size (Ne), which represents a more accurate indicator for population viability than census population size, and bottlenecks that populations may have recently incurred, can also be inferred using molecular markers (Chybicki and Burczyk, 2009;Luikart et al., 2010;Wang, 2016). Fitness quantitative traits inform on negative consequences of habitat fragmentation and population size on reproductive success, on inbreeding or outbreeding depression, and on (mal)adaptation, especially when combined with molecular markers (Zavodna et al., 2015;Barmentlo et al., 2018;Borrell et al., 2019;Van Rossum and Le Pajolec, 2021). Identifying phenotypic plasticity and maternal effects may also give insights into adaptive differences and non-neutral genetic variability accounting for population resilience (Nicotra et al., 2015;Hamilton et al., 2017;Van Rossum et al., 2020).
Besides, whether the potential source and target populations share common phylogeographic history (based on plastid markers) has been insufficiently examined (Van Rossum and Raspé, 2018;Vera et al., 2020;Bobo-Pinilla et al., 2021;Jacquemart et al., 2021), despite the risk of reproductive isolation and outbreeding depression in hybrids between genetic lineages that show diverging phylogeographic patterns (Edmands, 2007;Martin et al., 2017).
In this paper, we used plastid and nuclear molecular markers together with fitness-related quantitative traits to exemplify the different analyses that can be relevant to evaluate the genetic status of populations, stressing the results potentially leading to translocation failure. More precisely, we investigated the following factors: (1) genetic diversity, inbreeding levels and detection of recent bottlenecks in the adult generation of potential source or target populations and in the seed progeny that are intended for founding the translocated populations; (2) clonal extent in the adult generation; (3) genetic differentiation among populations; (4) contemporary gene flow, parentage, sibship and contemporary Ne; and (5) maternal effects, phenotypic plasticity and inbreeding depression in fitness-related quantitative traits. The study was carried out on Campanula glomerata L. (Campanulaceae), an insect-pollinated herbaceous species that is critically endangered in Belgium, only consisting of small populations, and that has been involved in a restoration project of calcareous grasslands which has implemented plant translocations (Godefroid et al., 2016a,b). We discuss the implications of our findings for species regional conservation and for designing plant translocation protocols using small populations as seed sources.

Study Species
Campanula glomerata L. (Campanulaceae) is a long-lived perennial, herbaceous species occurring on calcareous grasslands, scrub, open woodlands and sand dunes. It has a wide continental Eurasian distribution range, reaching its margin in northwestern Europe where the species becomes rare and where populations are scattered and disjunct (Hultén and Fries, 1986). The species is declining in Western Europe, as a result of the fragmentation and degradation of its habitats, and populations are often small and isolated (Green et al., 1997;Hill et al., 2004;Bachmann and Hensen, 2007;Godefroid et al., 2016a). In Belgium, the species is critically endangered (Saintenoy-Simon et al., 2006) and a species recovery plan has been designed, involving plant translocations in the framework of a Life project in Belgian Lorraine (southern Belgium). After an exhaustive field survey, only small populations of C. glomerata were found in Belgian Lorraine, but also in the surrounding regions in France and Luxembourg (Godefroid, pers. commun.). It was therefore decided to use several small populations as seed sources and to mix them (Godefroid et al., 2016a).
The plants flower from June to August. The flowering stems are up to 70 cm high and bear on average 20-30 (up to 190) violet-blue protandrous flowers arranged in clusters, which produce nectar (Denisow and Wrzesień, 2015;Strzałkowska-Abramek et al., 2018;Godefroid and Van Rossum, unpublished data). The species is characterized by a self-incompatible breeding system (Gadella, 1964). The main pollinators are bumble bees, solitary bees, honey bees and hoverflies (Albrecht et al., 2007;Denisow and Wrzesień, 2015). The numerous tiny seeds are dispersed over short distances by the shaking of the capsules, and do not form a persistent seed bank in the soil. Plants can live for 25-30 years, and subsist through vegetative spread by rhizomes (Klotz et al., 2002;Hill et al., 2004;Bachmann and Hensen, 2007). Clonal spread was estimated at 0.01-0.25 m/year (Klimešová and Klimeš, 2013).
[Figure caption, partly recovered: [...] Supplementary Table S1. The pie charts correspond to mean membership values at K = 8 clusters for the adult generation (except for FON: A, adults and P, seed progeny) of the structure analysis (Figure 2).]
[...] Godefroid et al., 2016a). From these, 31-101 seed progeny individuals per population were sampled for leaves, and rosette diameter was measured 2 months after germination. A total of 680 samples were dried in silica gel.

DNA Extraction
DNA was extracted from ca. 10-15 mg of dried leaf material using a CTAB method (Doyle and Doyle, 1990) for plastid analyses and with the NucleoMag 96 Plant kit (Macherey Nagel, Duren, Germany) according to the manufacturer's protocol for microsatellite analyses. We estimated the concentration of genomic DNA in the extracts using the Qubit Quantitation Platform (Invitrogen). DNA concentration was standardized to 2 ng/µl.

Plastid DNA Sequencing
Three plastid loci usually showing high polymorphism and commonly used for DNA barcoding (e.g., Hollingsworth et al., 2011;Xia et al., 2019) were tested for amplification and sequencing. The plastid marker rps16 (with rpsF and rpsR2; Oxelman et al., 1997) did not amplify. The two other loci were amplified and sequenced from five individuals per population (excluding FON population) for a total of 35 individuals, with the following primers: trnH-psbA with psbA3'f (Sang et al., 1997) and trnHf (Tate and Simpson, 2003), and matK with matK-xF (Ford et al., 2009) and matK-MALPR1 (Dunning and Savolainen, 2010). We followed standard PCR procedures (e.g., Shaw et al., 2005). PCR products were purified by adding 1 U of Exonuclease I and 0.5 U FastAP Alkaline Phosphatase (Thermo Scientific, Germany) and incubated at 37 °C for 1 h, followed by inactivation at 80 °C for 15 min. Sequencing was performed in both directions by Macrogen Europe (The Netherlands) with PCR primers. Chromatograms were edited and assembled in Geneious (Biomatters). Contig sequences are available from GenBank under accession numbers OL943797 (matK) and OL943798 (trnH-psbA).

Nuclear Microsatellite Analysis
We used 15 polymorphic microsatellite loci (excluding CAM19) previously developed by Van Rossum and Godé (2021), but with the protocol adapted as follows: PCR amplifications were performed in 10 µl reactions including 1X multiplex PCR master Mix (QIAGEN, Germany), 3 µl (5-20 ng) of genomic DNA, and 2.0-6.4 pmol labeled forward and reverse primers (see Table 2 in Van Rossum and Godé, 2021). The PCR cycling consisted of an initial denaturation at 95 °C for 15 min, followed by 35 cycles of denaturation at 95 °C for 30 s, annealing at 55 °C for 45 s and extension at 72 °C for 1 min, and a final extension at 60 °C for 30 min. Multiplexes were electrophoresed on an ABI PRISM 3130 sequencer (Applied Biosystems), with 1.5 µl of PCR product diluted with dH2O (1:10), and 9.75 µl of Hi-Di Formamide (Life Technologies, USA) and 0.25 µl of MapMarker 500 labeled with DY-632 (Eurogentec). Alleles were scored with software GeneMapper version 5 (Applied Biosystems) and Geneious Prime (Biomatters).

Data Analyses and Rationale
The research questions to investigate and the possible problems that might lead to translocation failure discussed here below are summarized in Table 1. For suggestion of other potential data analyses and related software, see e.g., Van Rossum and Hardy (2021), and Bourgeois and Warren (2021) for genomic approaches.

Phylogeographic History
Differences in plastid marker sequences among populations may reveal different phylogeographic patterns related to ancient processes, such as different (post)glacial migration histories, and the existence of contact zones where distinct genetic lineages have met (Hewitt, 2004;Hickerson et al., 2010). The related genetic divergence among, sometimes co-occurring, genetic lineages may induce reproductive isolation and outbreeding depression in hybrid progeny (Edmands, 2007;Martin et al., 2017), so that phylogeographic patterns need to be taken into account for source sampling (Vera et al., 2020;Bobo-Pinilla et al., 2021). The other research questions relating to genetic diversity (as described in Table 1) were based on the analysis of nuclear microsatellite markers and are described below.

Extent of Clonal Propagation
Populations characterized by high clonal propagation in the adults sampled for translocation material (cuttings, seeds) may show a reduced number of genotypes and of compatible mates compared to what could be expected given census population size, and high inbreeding levels in the seed progeny intended to be used for translocation (Van Rossum and Raspé, 2018;Bittebiere et al., 2020;Van Rossum and Le Pajolec, 2021). In order to estimate clonal propagation in the adult generation, we identified multilocus genotypes, assigned each sampled adult to a multilocus genotype and calculated the probability (pse) of finding the same multilocus genotype a second time in each population using GenAlEx 6.5 (Peakall and Smouse, 2012). Identical multilocus genotypes were likely to be putative clones when pse < 0.05, and to result from sexual reproduction when pse > 0.05 (Parks and Werth, 1993). The locus CAM27, which showed a lot of missing data due to a high null allele frequency (Van Rossum and Godé, 2021), and 21 individuals with missing data in the other loci were excluded from the analyses. For populations showing putative clones, we calculated the proportion of distinct multilocus genotypes as G/N (Ellstrand and Roose, 1987), where G is the number of multilocus genotypes and N the number of sampled adults.
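The G/N ratio described above can be illustrated with a minimal Python sketch. It assumes complete genotypes (no missing data) and exact allele matching; the sample names and allele sizes are hypothetical toy values, not data from this study, and the pse probability computed by GenAlEx is not reproduced here.

```python
from collections import Counter

def clonal_summary(genotypes):
    """Summarize clonality from multilocus genotypes (MLGs).

    genotypes: dict mapping a sample id to a tuple of per-locus genotypes,
    each per-locus genotype being a sorted tuple of two allele sizes.
    Returns (G, N, G/N, repeated), where repeated maps each MLG found in
    more than one sample to its number of copies.
    """
    counts = Counter(genotypes.values())
    g = len(counts)          # number of distinct multilocus genotypes
    n = len(genotypes)       # number of sampled adults
    repeated = {mlg: c for mlg, c in counts.items() if c > 1}
    return g, n, g / n, repeated

# Hypothetical toy data: two loci, four adults, A1 and A2 share one MLG.
adults = {
    "A1": ((150, 154), (201, 201)),
    "A2": ((150, 154), (201, 201)),
    "A3": ((150, 152), (201, 205)),
    "A4": ((152, 154), (203, 205)),
}
g, n, ratio, repeated = clonal_summary(adults)
print(f"G = {g}, N = {n}, G/N = {ratio:.2f}, repeated MLGs = {len(repeated)}")
```

Repeated genotypes flagged this way are only candidates for clones; whether they are treated as such still depends on the pse threshold given in the text.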

Genetic Variation Within Populations
Source material for translocation should show high genetic diversity to ensure evolutionary resilience, also to cope with changing environmental conditions, and low inbreeding levels to avoid inbreeding depression issues (e.g., Carroll and Fox, 2008;Sgrò et al., 2011;Prati et al., 2016;Commander et al., 2018;Hoban et al., 2018). In particular, attention should be paid when small populations are used as sources as they might be inbred and genetically depauperate, in the adult and/or the seed progeny generations (Vergeer et al., 2004;Van Geert et al., 2008;Wiberg et al., 2016;Tierney et al., 2020;Gargiulo et al., 2021).
The following estimates of genetic variation were calculated for each population for the adults and for the seed progeny using GEN-SURVEY (Vekemans and Lefèbvre, 1997) or FSTAT (Goudet, 2003): allelic richness (A[N]) for a fixed sample size (N = 5, excluding adult data of FON), observed (Ho) and expected (He) heterozygosity and Wright's inbreeding coefficient (FIS), corrected for small sample size. We tested the significance of the FIS values (over all loci) by randomization tests and sequential Bonferroni-type correction. As null alleles were detected at some loci (Van Rossum and Godé, 2021), mean inbreeding coefficient values (FISnull) and their 95% highest posterior density intervals (HPDI) were calculated after adjusting allele frequencies with INEST (Chybicki and Burczyk, 2009) to take a possible effect of null alleles on FIS values into account. We applied the Bayesian approach (IIM) with 10^6 Markov Chain Monte Carlo iterations, of which the first 10^5 were discarded as burn-in phase, testing two models: a full model (nfb, including null alleles, inbreeding and genotyping failures), and a model (nb) where inbreeding was set at 0. The best fitting model was indicated by the lowest value of the deviance information criterion (DIC). Differences in A[5], Ho, He, and FIS values between adult and seed progeny generations were tested by performing pairwise Wilcoxon matched pairs tests by locus. Relationships between within-population genetic estimates (G/N, A[5], Ho, He, FIS and FISnull) and flowering population size were examined for each generation by Gamma correlation analyses. Analyses were performed using STATISTICA version 12 (Dell Inc.).
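As an illustration of these within-population estimates, the following Python sketch computes, for a single locus, the observed heterozygosity, the expected heterozygosity with a small-sample correction, the resulting inbreeding coefficient, and allelic richness by rarefaction. It is a simplified sketch and not the GEN-SURVEY or FSTAT implementation: the small-sample correction used is the common 2n/(2n - 1) factor, the multilocus estimators of the software are not reproduced, and rarefying to N = 5 individuals is assumed here to mean g = 10 gene copies. The genotypes are hypothetical.

```python
from collections import Counter
from math import comb

def locus_diversity(genotypes):
    """Return (Ho, He, FIS) for one locus from a list of diploid genotypes."""
    n = len(genotypes)
    copies = 2 * n
    counts = Counter(allele for pair in genotypes for allele in pair)
    ho = sum(a != b for a, b in genotypes) / n
    # Expected heterozygosity with the 2n / (2n - 1) small-sample correction.
    he = copies / (copies - 1) * (1 - sum((c / copies) ** 2 for c in counts.values()))
    fis = 1 - ho / he if he > 0 else float("nan")
    return ho, he, fis

def rarefied_richness(allele_counts, g):
    """Expected number of alleles in a random subsample of g gene copies."""
    total = sum(allele_counts)
    return sum(1 - comb(total - c, g) / comb(total, g) for c in allele_counts)

# Hypothetical genotypes at one locus for four individuals.
sample = [(1, 2), (1, 1), (2, 2), (1, 2)]
ho, he, fis = locus_diversity(sample)
print(f"Ho = {ho:.3f}, He = {he:.3f}, FIS = {fis:.3f}")
print(f"A[g=2] = {rarefied_richness([4, 4], 2):.3f}")
```

Positive FIS values from such a calculation indicate a heterozygote deficit, which in the study can also arise from null alleles; this is why the null-allele-adjusted estimate is reported alongside it.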
To detect recent genetic bottlenecks in each population, we performed tests for the excess in heterozygosity on each generation (excluding adult sampling with n < 20) using INEST, under the Two-Phase model and with the default settings recommended by Peery et al. (2012). P-values of the Wilcoxon signed-rank tests were determined based on 10^6 permutations.

Genetic Structure Among Populations
Investigating how populations genetically diverged from each other might indicate the occurrence of several processes, such as genetic drift effects and disruption of gene flow (Aguilar et al., 2008;Ottewell et al., 2016). Too high among-population genetic differentiation might lead to transplant maladaptation and outbreeding depression in hybrid progeny (Edmands, 2007;Bowles et al., 2015). Comparing adult and seed progeny generations allows for identifying contemporary processes, especially pollen dispersal patterns (Hardy et al., 2004;Ritchie and Krauss, 2012;Van Rossum et al., 2020;Van Rossum and Le Pajolec, 2021).
We conducted several complementary analyses to describe the genetic structure of the populations. First, pairwise FST values were calculated between populations and generations according to Weir and Cockerham (1984), excluding FON adults because of a too small sample size (n = 3), and their significance was tested by randomization tests using FSTAT and Bonferroni correction. Differences in FST values between adult and seed progeny generations were tested by performing pairwise Wilcoxon matched pairs tests on the same 15 population pairs. Pairwise FST values were also calculated with the ENA correction using FreeNa (Chapuis and Estoup, 2007) so that the presence of null alleles could be taken into account. Bootstrap resampling over loci using 10,000 replicates was computed using FreeNa to obtain 95% confidence intervals.
Second, Cavalli-Sforza and Edwards' (1967) genetic distances were computed between all pairs of populations and generations, with the INA correction (taking possible null alleles into account), with bootstrapping over loci, using FreeNa. The patterns of between-population differentiation were summarized by performing a neighbor-joining (NJ) cluster analysis on the matrix of Cavalli-Sforza and Edwards' distances using the software Populations 1.2.3.1 (Olivier Langella, Montpellier). The cluster analysis was visualized using FigTree v.1.4.3 (Rambaut, 2016).
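For orientation, the chord distance used in this step can be sketched in Python as below. The sketch uses the multilocus formulation D = (2 / (pi r)) * sum over the r loci of sqrt(2 (1 - sum over alleles of sqrt(x y))), with x and y the allele frequencies in the two populations; that this is the exact scaling applied by FreeNa is an assumption, and the INA correction for null alleles and the bootstrapping over loci are not reproduced. Allele frequencies are hypothetical.

```python
from math import pi, sqrt

def chord_distance(pop_x, pop_y):
    """Chord distance between two populations.

    pop_x, pop_y: lists with one dict per locus, mapping allele -> frequency
    (frequencies at each locus sum to 1). Loci must be in the same order.
    """
    total = 0.0
    for fx, fy in zip(pop_x, pop_y):
        # Only alleles present in both populations contribute to the sum.
        shared = sum(sqrt(fx[a] * fy[a]) for a in fx.keys() & fy.keys())
        total += sqrt(2 * max(0.0, 1 - shared))
    return 2 * total / (pi * len(pop_x))

# Hypothetical frequencies at two loci.
pop_a = [{150: 0.5, 154: 0.5}, {201: 0.5, 205: 0.5}]
pop_b = [{152: 0.5, 156: 0.5}, {203: 0.5, 207: 0.5}]  # no allele shared with pop_a
print(round(chord_distance(pop_a, pop_a), 3))
print(round(chord_distance(pop_a, pop_b), 3))
```

A matrix of such pairwise values is the kind of input a neighbor-joining analysis takes.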
Finally, we inferred population structure using STRUCTURE version 2.3.4 (Pritchard et al., 2000) performed with Structure_threader (Pina-Martins et al., 2017). This Bayesian clustering method identifies the number of clusters (K) of distinct gene pools that differ by a set of allele frequencies at each locus. We performed analyses for K = 1 to K = 15 clusters (30 different runs), using an admixture ancestry model with correlated allele frequencies, run length of burn-in period of 10^6 iterations, and 2 × 10^6 Markov Chain Monte Carlo replications. Null alleles were treated as recessive and genotypes with missing data were considered as homozygous for null alleles (Falush et al., 2007). The most likely number of K clusters was inferred based on the ad hoc statistic DeltaK and the highest likelihood value (Evanno et al., 2005), after running STRUCTURE HARVESTER (Earl and vonHoldt, 2012), but results for each K cluster were also checked. The most likely estimated membership (Q) values of the 10 best (with the highest likelihood values) independent runs were obtained with CLUMPP version 1.1.2 (Jakobsson and Rosenberg, 2007) and visualized on barplots using DISTRUCT version 1.1 (Rosenberg, 2004).
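The DeltaK statistic mentioned above can be sketched as follows. For each K with neighbors K - 1 and K + 1, the sketch divides the absolute second-order difference of the mean log-likelihoods across replicate runs by the standard deviation of the log-likelihoods at K. This is an illustrative reading of the Evanno et al. (2005) statistic, not the STRUCTURE HARVESTER code, and the log-likelihood values are hypothetical.

```python
from statistics import mean, stdev

def delta_k(lnp):
    """Compute DeltaK from replicate log-likelihoods.

    lnp: dict mapping K -> list of ln P(D) values over replicate runs.
    Returns a dict K -> DeltaK for every K that has both neighbors and a
    non-zero standard deviation.
    """
    out = {}
    for k in sorted(lnp):
        if k - 1 not in lnp or k + 1 not in lnp:
            continue  # DeltaK is undefined for the smallest and largest K
        second_diff = abs(mean(lnp[k + 1]) - 2 * mean(lnp[k]) + mean(lnp[k - 1]))
        sd = stdev(lnp[k])
        if sd > 0:
            out[k] = second_diff / sd
    return out

# Hypothetical log-likelihoods for K = 1..4, three runs each.
runs = {
    1: [-1000.0, -1002.0, -998.0],
    2: [-900.0, -901.0, -899.0],
    3: [-890.0, -895.0, -885.0],
    4: [-888.0, -880.0, -896.0],
}
dk = delta_k(runs)
print(dk, "best K by DeltaK:", max(dk, key=dk.get))
```

Because DeltaK cannot be computed for the lowest K tested, it is usually read together with the likelihood values themselves, as done in the text.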

Contemporary Pollen Flow and Effective Population Size
Genetic quality of the seeds collected for transplant propagation depends on whether there was effective contemporary pollen dispersal among individuals and populations (Hardy et al., 2004;Vergeer et al., 2004;Menges, 2008;Ritchie and Krauss, 2012;Aguilar et al., 2019). When most seed progeny are genetically related and belong to a few large families, consisting of full sibs (sharing both parents) and half sibs (sharing one of the two parents), or of selfed offspring (in case of self-compatible species), the number of compatible mates can be reduced and high levels of inbreeding may be expected in the next generation (Aguilar et al., 2019;Thomas et al., 2021). Therefore, it is important to verify that parentage contributions were spread over as many parents as possible, especially in case of small populations. Contemporary Ne is a better indicator of population health than census population size, as Ne takes several demographic factors, such as sex ratio, mate compatibility, inbreeding and family size, into account (Carroll and Fox, 2008;Luikart et al., 2010). Recent bottlenecks are interesting to detect, as they may indicate population declines through stochastic demographic or environmental events (partial destruction of the populations, harsh climatic season) and reduce Ne (Peery et al., 2012). However, sibship, Ne and bottlenecks are still overlooked in genetic assessment for plant translocations (Krauss et al., 2002;Ritchie and Krauss, 2012;Bragg et al., 2020;Gargiulo et al., 2021;Thomas et al., 2021).
Sibship, parentage and contemporary Ne were inferred on seed progeny for each population (excluding CEC), using COLONY version 2.0.6.6 (Wang, 2009;Jones and Wang, 2010). The adult generation was included as candidate (maternal and paternal) parent genotypes (excluding putative clones) in the analyses. The analyses were performed using a full likelihood approach with high likelihood precision, no prior sibship size, medium length of run, and probability of an actual father or mother being included in candidates guessed at 0.8 (or 0.5 for FON and VER), and taking estimations of allelic dropout rates (null allele frequencies) and genotypic error rates into account (Wang, 2018, 2019). The other parameters were set as by default or at 0. For Ne, estimated from sibship frequency of a single cohort (in our case seed progeny generation), 95% confidence intervals were obtained from bootstrapping (Wang, 2016). Note that the sibship assignment method in species with overlapping generations gives an estimate of the effective number of breeders when applied to a single cohort, taking non-random mating (and selfing for self-compatible species) into account (Wang, 2016). This is particularly useful for detecting population trends when planning translocation experiments in long-lived species. To search for possible contemporary pollen flow between Belgian populations, the analyses were also performed with all sampled adults of the six Belgian populations as candidate parent genotypes (n = 88). The best configuration (with the maximum likelihood) of the seed progeny parentage was visualized as a pedigree graph using Pedigree Viewer version 6.5f (Kinghorn and Kinghorn, 2015).

Data Analyses Based on Fitness-Related Quantitative Traits
Fitness-related quantitative traits grown under controlled conditions can vary according to genetic factors, such as individual inbreeding level (lower fitness expected for homozygous genotypes) and heterosis (higher fitness of the heterozygotes), which may affect transplant survival and reproductive success (Zavodna et al., 2015;Barmentlo et al., 2018;Van Rossum and Le Pajolec, 2021). Differences in fitness among populations or maternal plants can also indicate local processes (genetic drift, selection) leading to genetic variability and potentially differing adaptive capacities (Willi et al., 2007;Nicotra et al., 2015;Hamilton et al., 2017). Phenotypic differences may be plastic and disappear after time, suggesting that plants may adapt to changing environments and translocation sites (Gentili et al., 2018;Van Rossum et al., 2020;Van Rossum and Le Pajolec, 2021). Strong maternal effects may favor plant growth and survival in the original environmental conditions, but might impair them if environmental conditions differ, e.g., in translocation sites (Schuler and Orrock, 2012).
We performed a one-way Analysis of Covariance (ANCOVA) together with pairwise Tukey HSD post hoc tests for testing for differences in seed progeny rosette diameter in relation to population, with homozygosity by locus (HL, as an indicator of inbreeding) as a covariate. This estimate, which varies from 0 (all loci are heterozygous) to 1 (all loci are homozygous; Aparicio et al., 2006), was calculated for each individual, using the software GENHET (Coulon, 2010). CAM27 was excluded from the analyses because of a high frequency of null alleles in all populations (ranging 18-58%), which might falsely inflate HL. Differences in seed progeny HL between populations were tested by computing a one-way ANOVA together with pairwise Tukey tests. The 18 possibly wrongly labeled MAR seed progeny (see results of the structure analysis below) were excluded from the AN(C)OVAs. We performed a Pearson's correlation analysis between rosette diameter and HL. HL was Box-Cox-transformed to achieve normality and homoscedasticity. The tests were conducted with STATISTICA.
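To make the HL index concrete, the sketch below follows the definition of Aparicio et al. (2006), in which each locus is weighted by its expected heterozygosity: HL is the sum of expected heterozygosities of the loci at which the individual is homozygous, divided by the sum over all its typed loci. It is an illustrative sketch rather than the GENHET code; the handling of missing genotypes (skipped) is an assumption, and the genotypes and per-locus expected heterozygosities are hypothetical.

```python
def homozygosity_by_locus(genotype, locus_he):
    """Homozygosity by locus (HL) for one individual.

    genotype: list of (allele_a, allele_b) tuples, or None where missing.
    locus_he: expected heterozygosity of each locus, in the same order.
    Returns a value between 0 (all typed loci heterozygous) and
    1 (all typed loci homozygous), or None if no locus is typed.
    """
    e_hom = 0.0  # summed expected heterozygosity of homozygous loci
    e_het = 0.0  # summed expected heterozygosity of heterozygous loci
    for pair, he in zip(genotype, locus_he):
        if pair is None:
            continue  # assumption: missing loci are simply ignored
        if pair[0] == pair[1]:
            e_hom += he
        else:
            e_het += he
    if e_hom + e_het == 0:
        return None
    return e_hom / (e_hom + e_het)

# Hypothetical individual typed at three loci.
individual = [(150, 150), (201, 205), (310, 310)]
expected_he = [0.8, 0.6, 0.2]
print(homozygosity_by_locus(individual, expected_he))
```

Weighting by expected heterozygosity means that homozygosity at a highly variable locus counts more towards HL than homozygosity at a locus with few alleles.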

Plastid DNA Sequencing
The sequences of trnH-psbA and matK were 393 and 908 bp long, respectively. No polymorphism was observed in either matK or trnH-psbA, so that concatenating the sequences and building a phylogenetic tree was not relevant.

Loci and Scored Alleles
We found 13 to 39 alleles per locus for a total of 376 alleles, of which 43 were private (1 each in CEC, CHA and FON populations, 2 in EMO, 4 in LAM, 8 in VER and 26 in MAR; there was no private allele in WAT). All loci were polymorphic in all populations and in adult and seed progeny generations.

Extent of Clonal Propagation
Out of 659 sampled individuals, 643 were associated with a single multilocus genotype. The 487 seed progeny individuals showed distinct multilocus genotypes. Seven multilocus genotypes (1-3 per population) were assigned to two or three adult samples in CHA, LAM, MAR or VER, which can be considered as putative clones (pse < 0.05). For CHA, LAM, MAR and VER populations, G/N of the adult generation was 0.96, 0.88, 0.97, and 0.87, respectively, and 1.00 for the other populations. Therefore, clonal spread could be considered as marginal in our studied populations.

Genetic Variation Within Populations and Across Generations
In most populations, estimates of genetic variation (A[5], Ho, He) were high. A lower allelic richness A[5] and/or genetic diversity (Ho, He) in the seed progeny than in the adults was found in FON, WAT, LAM, MAR (excluding possibly wrongly labeled individuals) and VER [...] (Table 2). However, inbreeding was detected for the adult generation in MAR and in the seed progeny for CHA, LAM and VER (lowest DIC for the nfb model in the IIM analysis). There was a significant correlation between flowering population size and adult FIS (Gamma correlation Γ = 0.615, P = 0.040), but the correlation was no longer significant with adult FISnull (Γ = −0.077, P = 0.797). The correlations between flowering population size and the other genetic estimates (G/N, A[5], Ho, and He) in both generations and seed progeny FIS and FISnull were not significant (Γ ranging from −0.619 to 0.474, P ≥ 0.060). Recent genetic bottlenecks were detected in LAM population for both adult and seed progeny generations (Wilcoxon signed-rank tests Z = 2.44 and 2.10, P = 0.006 and 0.018, respectively) and for WAT seed progeny (Z = 2.78, P = 0.002). No evidence of recent bottlenecks was found for the other populations (Z ranging from −1.42 to 0.28, P ≥ 0.402).

Genetic Structure Among Populations and Between Generations
Genetic differentiation between adults and their seed progeny (within populations) was absent or low, with F ST values varying from 0.000 to 0.012 and mostly not significant (P > 0.05). By contrast, genetic differentiation between populations was moderate to high and significant (P < 0.05 after Bonferroni correction) for most population pairs (Supplementary Table S3). F ST values were higher for seed progeny than for adults (F ST values between population pairs ranging 0.035-0.154 and 0.082-0.159, respectively; Wilcoxon matched pairs test Z = 3.01, P = 0.003). Values were similar when applying the ENA correction (Supplementary Table S3).
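To make the Bonferroni correction mentioned above concrete, the short Python sketch below counts the pairwise comparisons among the eight sampled populations and derives the per-test threshold. It assumes that all population pairs were tested within a generation and that the nominal level was 0.05; the exact set of tests used in the study is not stated here.

```python
from itertools import combinations

populations = ["CEC", "CHA", "EMO", "FON", "LAM", "MAR", "VER", "WAT"]
pairs = list(combinations(populations, 2))  # 8 * 7 / 2 = 28 population pairs

alpha = 0.05                            # assumed nominal significance level
bonferroni_alpha = alpha / len(pairs)   # per-test threshold after correction


def is_significant(p_value, n_tests, alpha=0.05):
    """True if p_value passes a Bonferroni-corrected threshold."""
    return p_value < alpha / n_tests
```

Under these assumptions, a pairwise test is declared significant only if its P value falls below roughly 0.0018.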
The tree based on Cavalli-Sforza and Edwards' genetic distances with the INA correction for null alleles and the STRUCTURE analysis (best modal clusters at K = 2 and 8, Supplementary Figure S1) showed similar patterns of among-population genetic differentiation, with two main clusters separating the two French populations (MAR and VER) from the Belgian populations, and with adult and seed progeny generations of the same population clustering together (Figure 2). At K = 8, the Bayesian clustering made it possible to distinguish the populations based on high membership Q values (> 80%), except CEC, whose individuals were assigned to several clusters. In FON progeny, individuals were assigned to two different clusters (dark and light brown), some individuals being admixed between the two clusters. One cluster (dark brown) was not found in the sampled adults, which suggested that it might correspond to non-genotyped adults. In MAR (France), 18 seed progeny individuals appeared to belong to another cluster (light blue) typical of CHA from Belgium, with very high Q values (≥ 87%), so with no indication of admixture, and not found in the adult generation. After verification, and as these 18 individuals belonged to the same numbering range, a possible explanation was wrong labeling during seedling pricking or juvenile repotting. For this reason, the other data analyses were performed again excluding these 18 seed progeny individuals. The barplots obtained for the other K clusters did not bring additional information (results not shown).

Contemporary Pollen Flow and Effective Population Size
Sibship and parentage assignments using COLONY showed low proportions of full sibs and half sibs and usually small full-sib family sizes, so that most seed progeny individuals were not closely related in most populations, except in WAT (Table 3; Supplementary Figure S2). The number of offspring per inferred (maternal or paternal) parent genotype was small, except for a few parents and in WAT (Table 3; Supplementary Figure S2). These results indicated representative seed sampling of the maternal plants and high multiple paternity. N e estimated from sibship assignments ranged from 6 to 40, and was similar to (EMO, LAM, WAT) or higher than (CHA, FON) flowering population size (Table 2) in most populations, but lower for MAR and VER (Table 3). The number of seed progeny for which at least one maternal or paternal parent was assigned from sampled adult genotypes was usually low, except for VER and WAT, where candidate parents were identified as sampled adult genotypes for more than two thirds of the seed progeny (Table 3). Only one offspring (from EMO) had an assigned parent from sampled adults of another population (CHA).
Table 3 notes. Full sibs and half sibs (%): percentage of full-sib and half-sib seed progeny pairs, respectively; full-sib family size: mean number of seed progeny per full-sib family, with range; seed progeny: mean number of seed progeny per maternal and paternal parent (parent 1 or 2), with ranges; sampled adults as parents: number of seed progeny whose sampled adult genotypes were assigned as maternal or paternal parent. a Mean value not calculated (only two families).
Populations of similar flowering population sizes (e.g., CHA and LAM) significantly differed from each other (Tukey tests, P < 0.001; Figure 3A). There were significant differences between populations in HL (R² = 0.072, F (6,477) = 6.16, P < 0.001), with VER showing higher HL values than WAT, LAM, EMO and MAR (Tukey tests, P ≤ 0.016; Figure 3B), but there was no significant correlation between HL and flowering population size (Γ = 0.158, P = 0.636).
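The individual-level HL index compared among populations above can be sketched as follows, assuming that HL denotes homozygosity by locus: the sum of expected heterozygosities of the loci at which an individual is homozygous, divided by the sum over all of its typed loci. The genotypes below are hypothetical, expected heterozygosity is computed from the toy sample itself, and missing data are not handled.

```python
from collections import Counter


def expected_heterozygosity(genotypes_at_locus):
    """1 - sum of squared allele frequencies for one locus."""
    alleles = [a for genotype in genotypes_at_locus for a in genotype]
    counts = Counter(alleles)
    n = len(alleles)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def homozygosity_by_locus(individual, locus_he):
    """HL = sum(He of homozygous loci) / sum(He of all typed loci)."""
    hom = sum(he for (a, b), he in zip(individual, locus_he) if a == b)
    het = sum(he for (a, b), he in zip(individual, locus_he) if a != b)
    return hom / (hom + het)


# Toy sample: 4 individuals, 2 loci each (hypothetical allele codes).
toy_sample = [
    [(1, 2), (5, 5)],
    [(1, 1), (5, 6)],
    [(2, 3), (5, 6)],
    [(1, 3), (6, 6)],
]
n_loci = len(toy_sample[0])
locus_he = [expected_heterozygosity([ind[i] for ind in toy_sample])
            for i in range(n_loci)]
toy_hl = [homozygosity_by_locus(ind, locus_he) for ind in toy_sample]
```

Weighting loci by their expected heterozygosity gives more influence to homozygosity at highly variable loci, which is why HL ranges from 0 (heterozygous at all loci) to 1 (homozygous at all loci).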

No Evidence of Distinct Genetic Lineages Related to Past Events
The lack of polymorphism in the sequenced plastid DNA markers suggests no strong genetic divergence among populations related to past phylogeographic processes. Consequently, the existence of divergent, reproductively isolated genetic lineages among the studied populations of C. glomerata is unlikely. A similar pattern was also reported for Arnica montana (Van Rossum and Raspé, 2018) and for the bird-dispersed Juniperus communis (Jacquemart et al., 2021), both being critically endangered in southern Belgium. By contrast, distinct genetic lineages were found for A. montana (Vera et al., 2020) and for Jacobaea auricula (Bobo-Pinilla et al., 2021) in Spain, and reproductively isolated genetic lineages were found for Silene nutans in southern Belgium (Martin et al., 2017).

Genetic Status of the Remaining Populations
Unlike Arnica montana, Dianthus deltoides and Linnaea borealis, whose small populations were found to be highly clonal and showed low genetic variation (Wiberg et al., 2016; Van Rossum and Raspé, 2018; Meeus et al., 2012), the small populations of C. glomerata showed only marginal clonal spread and, in most cases, high genetic variation. Given the high longevity (25-30 years) of the plants, which may also subsist as non-sprouting rhizomes in the soil (Bachmann and Hensen, 2007), we may assume that the adults consist of old genets still retaining pre-disturbance genetic diversity (Van Geert et al., 2008; Thomas et al., 2021). Besides, most populations showed no inbreeding in the adult generation, as expected for species characterized by a self-incompatible breeding system (Charlesworth, 2006; Dudash and Murren, 2008). Five populations showed a slight loss of allelic richness or genetic diversity in the seed progeny generation compared to the adults, and three populations showed low but significant inbreeding levels in seed progeny, suggesting that the processes (genetic drift effects, crosses between relatives leading to biparental inbreeding) usually observed in fragmented populations (e.g., Aguilar et al., 2008, 2019; Berjano et al., 2013; Wiberg et al., 2016) may have started. There was also evidence of recent genetic bottlenecks in LAM and WAT populations, suggesting that these populations may have incurred demographic reductions (Peery et al., 2012), as found for the self-incompatible Seseli farrenyi (Garcia-Jacas et al., 2021). No effect of flowering population size was observed on the genetic variation estimates, but the size range may be too narrow to detect any relationship.

Extensive Contemporary Pollen Flow Within Populations but Limited Among Populations
Adults and seed progeny generations were genetically similar (low F ST values) in all studied populations of C. glomerata.
The small size of full-sib and half-sib families for most parents indicated high multiple paternity, and contemporary N e estimates were similar to or even higher than flowering population sizes for most populations. These findings suggest extensive contemporary pollen flow within populations. Flowers of C. glomerata are an important nectar and pollen resource for many insects and so attract a variety of pollinators, such as bumble bees, solitary bees, honey bees and hoverflies (Albrecht et al., 2007; Denisow and Wrzesień, 2015; Strzałkowska-Abramek et al., 2018), which may favor extensive pollen flow across the whole population (Ghazoul, 2005; Van Rossum, 2010; DiLeo et al., 2018). High multiple paternity was also found in an isolated but large population of the self-incompatible Arabidopsis halleri, with only 4% of offspring sharing the same father (Llaurens et al., 2008), while a value of 20% was found in a population of 96 flowering plants of the self-incompatible Centaurea corymbosa (Hardy et al., 2004). By contrast, the high genetic differentiation among populations (for both adult and seed progeny generations) indicated by the moderate to high F ST values, and the lack of admixture indicated by genetic structure and parentage analyses based on nuclear microsatellite markers, suggest that contemporary gene flow may be limited or marginal among the Belgian populations, despite the short geographic distances separating them (< 3.6 km). Although seed dispersal efficiency is known to be low, barely over a few meters (Bachmann and Hensen, 2007), such a pattern is surprising for pollen dispersal, given the flight and pollen dispersal abilities reported over such distances for some pollinators of C. glomerata (Ghazoul, 2005; Van Rossum, 2010; Zurbuchen et al., 2010).
This suggests the existence of dispersal barriers between populations, or that the small flowering size of the populations reduces the probability that pollinators encounter neighboring populations, so that the maintenance of constancy in pollinator foraging behavior (especially for bumble bees and honey bees) can be compromised (Ghazoul, 2005; Van Rossum and Triest, 2010, 2012; DiLeo et al., 2018).
However, only a low number of sampled adults could be assigned as parents of the seed progeny in most populations, suggesting that the parentage analyses might lack accuracy. Several factors may have limited their accuracy: the 15 microsatellite markers might not have been sufficiently informative (Jones and Wang, 2010); the sampling of adults might not have been exhaustive enough (Hardy et al., 2004); and sampling of the seed cohort in 2012 and of adults in 2013 for two populations might have led to discrepancies between generations, related to fluctuating population sizes, disturbance (mowing, bottlenecks) or weather conditions affecting pollen production (Bachmann and Hensen, 2006, 2007; Denisow and Wrzesień, 2015).

Fitness of the Seed Progeny
Reduced plant fitness is usually expected in small populations due to genetic erosion, inbreeding and deteriorated habitat conditions, especially in the seed progeny (e.g., Vergeer et al., 2004; Zavodna et al., 2015; Aguilar et al., 2019). In the studied populations of C. glomerata, inbred progeny plants (with high HL values) were indeed smaller, despite good cultivation conditions, suggesting some inbreeding depression (related to biparental inbreeding) at the individual level (Willi et al., 2005; Barmentlo et al., 2018). Surprisingly, plants in small populations tended to be larger, but populations also differed in plant size independently of population size, suggesting maternal effects at early growth stages (Roach and Wulff, 1987) and high genetic adaptive variability (Basey et al., 2015; Hamilton et al., 2017). Moreover, the degraded nutrient-poor calcareous grasslands of Western Europe are usually reforested and eutrophicated (e.g., Betz et al., 2013; Hautekèete et al., 2015), leading to increased edge effects and competition in closing vegetation (Jacquemyn et al., 2003; Oostermeijer et al., 2003; Maurice et al., 2012). Plant height of C. glomerata was reported to increase under competition in dense cover vegetation (Bachmann et al., 2005), and the sampled populations also differed in local ecological conditions, e.g., they occurred on sandy or marly soils (Godefroid et al., 2016a). Therefore, the observed differences in plant size and maternal effects on progeny at early growth stages may reflect local adaptation of old, resilient genotypes (Oostermeijer et al., 2003; Willi et al., 2007), which might lead to maladaptation if translocation is implemented in restored nutrient-poor grasslands (Reckinger et al., 2010; Schuler and Orrock, 2012).
Differences in plant growth, however, might also reflect phenotypic plasticity (Nicotra et al., 2015; Gentili et al., 2018), so that among-source population differences might disappear after translocation and in the newly established generations (Van Rossum et al., 2020; Van Rossum and Le Pajolec, 2021).

Implications for Translocation Design
The small populations of C. glomerata investigated in this study showed good potential as providers of a highly diverse gene pool (for both neutral and adaptive traits), with multiple parentage among the seed progenies, and so can be considered good candidates as seed sources for translocations. As populations are differentiated from each other, each population only contains a fraction of the species' regional genetic diversity. Therefore, mixing the sources will not only optimize the number of variants and of compatible mates in the translocated populations, but also the representativeness of the species' regional genetic diversity.
Genetic diversity poses no immediate threat to the persistence of these populations, but small effective population sizes (and so a limited number of compatible mates) indicate that populations are vulnerable to stochastic extinction (Carroll and Fox, 2008). Moreover, only one of the Belgian populations is located in a protected area (CEC), while the others occur on roadsides or along a quarry exploited for sand (EMO), so that they are particularly vulnerable to anthropogenic disturbances. Alerting the roadside managers to the presence of such valuable populations of a critically endangered species is essential. Given the evidence of the onset of processes leading to genetic erosion, inbreeding and inbreeding depression, management measures need to be implemented to counteract these trends. Therefore, population size has to be increased through habitat restoration facilitating recruitment, flowering and pollination, and gene flow has to be restored by identifying and removing the possible barriers to pollen flow and by reconnecting populations through biological corridors or stepping-stone populations (Van Rossum and Triest, 2012; Ottewell et al., 2016). Creating new populations through translocations in protected and restored areas will also contribute to safeguarding the species' regional genetic diversity and to improving its regional conservation status.

CONCLUSIONS
The present study represents a good illustration of the use of genetic tools combining neutral nuclear and plastid molecular markers and adaptive fitness-related quantitative markers to assess the population genetic status of a critically endangered species in the framework of a restoration program involving plant translocations. In particular, estimating parentage, sibship and N e of the source material intended to be used for translocations can be a useful approach for assessing the genetic quality of seeds and contemporary pollen flow processes.

AUTHOR CONTRIBUTIONS
FVR conceived and designed the research, analyzed the data, and wrote the manuscript. OR conducted the plastid analyses. SLP conducted the germination and growth experiment. CG conducted the microsatellite analyses and genotyping. All authors critically reviewed the manuscript. All authors contributed to the article and approved the submitted version.

FUNDING
This study was funded by the European Union LIFE+ Nature and Biodiversity Program (Project No. LIFE11 NAT/BE/001060).

ACKNOWLEDGMENTS
We thank the Département de la Nature et des Forêts (Service Public de Wallonie) and Natagora for giving access to the study sites and authorization to collect plant material, S. Godefroid for help in collecting leaf material and rosette diameter measurement, P. Bardin for collecting material in MAR population, F. Vandelook for seed progeny propagation in the greenhouse, A. Destombes and S. Contreras (Genoscreen, France) for microsatellite development, P. Asselman for help with primer pair selection, W. Baert for plastid molecular analyses, S. Gallina (Evo-Eco-Paleo, Univ. Lille, France) for the STRUCTURE analyses that were carried out using Structure_threader, and R. Gargiulo and two reviewers for constructive comments on the manuscript.