Genetic diversity, population structure, and phylogenetic relationships of a widespread East Asia herb, Cryptotaenia japonica Hassk. (Apiaceae) based on genomic SNP data generated by dd-RAD sequencing

Single-nucleotide polymorphisms (SNPs) are the most prevalent form of genomic polymorphism and are extensively used in population genetics research. Using dd-RAD sequencing, a high-throughput sequencing method, we investigated the genome-level diversity, population structure, and phylogenetic relationships among three morphological forms of the widely distributed taxon Cryptotaenia japonica Hassk., which is native to East Asia. Our study aimed to assess the species status of C. japonica according to its genetic structure and genetic diversity patterns among 66 naturally distributed populations, comprising 26 C. japonica f. japonica, 36 C. japonica f. dissecta (Y. Yabe) Hara and 4 C. japonica f. pinnatisecta S. L. Liou accessions. Based on genomic SNP data generated by dd-RAD sequencing, we conducted genetic diversity, principal component, neighbor-joining (NJ) phylogenetic, admixture clustering, and population differentiation analyses. The findings revealed the following: (1) 539,946 unlinked, high-quality SNPs were generated, with mean π, H_O, H_E and F_IS values of 0.062, 0.066, 0.043 and −0.014, respectively; (2) population divergence was unaffected by isolation by distance; (3) six main distinct regions corresponding to geographic locations and exhibiting various levels of genetic diversity were identified; (4) pairwise F_ST analysis showed significant (P < 0.05) population differentiation in 0%–14% of populations among the six regions after sequential Bonferroni correction; and (5) three migration events (historical gene flow) indicated east‒west directionality. Moreover, contemporary gene flow analysis using Jost’s D, Nei’s G_ST, and Nm values highlighted the middle latitude area of East Asia as a significant contributor to genetic structuring in C. japonica. Overall, our study elucidates the relatively low genetic differentiation and population structure of C. japonica across East Asia, further enhancing our understanding of plant lineage diversification in the Sino-Japanese Floristic Region.


Introduction
Genetic diversity is one of the pillars of biodiversity. High genetic diversity increases the ability of wild plants to survive and reduces the risk of species extinction, and the study of genetic diversity allows species fitness to be predicted (Phillips et al., 2012; Hu et al., 2022). In addition, genetic differentiation and gene flow are important elements in understanding the evolutionary and adaptive potential of populations: high gene flow reduces the incidence of inbreeding and population differentiation by increasing the exchange of genetic material between populations (Waqar et al., 2021). Plant genetic diversity is influenced by seed dispersal, reproductive systems, life history, geographic range and evolutionary history.
The Sino-Japanese Floristic Region of East Asia harbors the most diverse temperate flora in the world and was the most important glacial refuge for Tertiary representatives ('relics') throughout Quaternary ice-age cycles (Qiu et al., 2011). Cryptotaenia DC. (Apiaceae) is a polyphyletic genus with three species in the tribe Pimpinelleae and four in the tribe Oenantheae (Spalik and Downie, 2007). Cryptotaenia japonica Hassk. in the tribe Oenantheae is present in regions that are important glacial refuges: C. japonica is endemic to East Asia (China, Japan and the Korean Peninsula) (Spalik and Downie, 2007). In Japan, C. japonica is known as "Mitsuba" and is used as a condiment (seeds) or a garnish (tender leaves) (Okuno et al., 2017), and in China, it is known as "Ya-er-qin" and is used as a tonic to strengthen the human body and as a vegetable (Wu et al., 2014). It is treated as a species with three similar forms, C. japonica f. japonica, C. japonica f. dissecta (Y. Yabe) Hara, and C. japonica f. pinnatisecta S. L. Liou, because it is a distinctive, widespread taxon exhibiting almost continuous variation in leaves and inflorescences across its range (Pan and Watson, 2005).
In the present study, we used SNP markers from dd-RAD sequencing, admixture clustering, principal component analysis (PCA), neighbor-joining (NJ) phylogenetic analysis and gene flow methods to comprehensively investigate the mechanisms underlying the genetic diversity and distribution differences of C. japonica across a large spatial scale in East Asia. We aimed to (1) elucidate the population structure and intraspecific divergence of the lineages of C. japonica Hassk., (2) reveal the spatial pattern of genetic diversity among the identified geographical regions, and (3) provide a reference for future research on the diversity of widespread plants in Asia.

Taxon sampling and identification of samples
China (South, West and East), the Korean Peninsula and Japan were considered the three major distribution regions of C. japonica Hassk. Fresh leaves, preferably young leaves, were collected from 1-6 individuals at each location and desiccated in silica gel. The dry leaves were stored at −20 °C until use. Vouchers were deposited in the Herbarium of the Institute of Botany, Jiangsu Province and the Chinese Academy of Sciences (NAS). We identified the form of each sample using previously described methods (Shan and Sheh, 1985; Liou, 1990). In this study, a total of 179 wild individuals representing 26 populations of C. japonica f. japonica, 36 populations of C. japonica f. dissecta (Y. Yabe) Hara, 4 populations of C. japonica f. pinnatisecta S. L. Liou and 1 population of C. canadensis were collected (Figure 1). The 176 C. japonica samples were collected from 66 locations in 3 countries across the entire natural range of this species, including Mainland China, Taiwan Island, the Korean Peninsula and the Japanese Islands. Our samples of C. japonica included all the proposed forms. In addition, 3 individuals of C. canadensis (L.) DC. from 1 wild population were used as an outgroup (Table 1; Figures 2, 3).

DNA extraction and quality control
Total genomic DNA was isolated from leaves using a TIANGEN DP305 DNA extraction kit. Extracted DNA was purified with Qiagen DNA Purification Kits (Qiagen, Inc., Valencia, CA, United States). The concentration and quality of the purified DNA samples were checked with a OneDrop™ OD-1000+ spectrophotometer (www.onedrop.cn) and 1% agarose gel electrophoresis. Finally, 179 of the 212 collected DNA samples met the minimum quality requirements of this study and were used for subsequent sequencing and genotyping.

Double digest restriction site associated DNA sequencing (ddRAD-seq) library preparation and sequencing
Only total genomic DNA with a concentration >100 ng/μL was used for ddRAD-seq library construction. The libraries of each individual were prepared with the VAHTS Universal DNA Library Prep Kit for Illumina (ND607, bio.vazyme.com) and sequenced by Genepioneer Biotechnologies Co., Ltd. (Nanjing, China) on an Illumina NovaSeq 6000 System sequencer. The enzymes EcoRI (Read1, G^AATTC) and NlaIII (Read2, Hin1II, CATG^) were used for DNA digestion, and the fragments were ligated to a barcode adaptor and a common adaptor with compatible sticky ends. The target fragments were kept within the range of 300-500 bp in size. Then, 150 bp paired-end reads were generated, yielding approximately 216 GB of raw data for the 179 individuals.
The raw data were processed in a quality control pipeline using FASTP software (version 0.20.0) and its own script to obtain clean data, with the parameters set to -q 5 -n 5 (Chen et al., 2018). Then, the trimfq function in SEQTK software (version 1.3-r106) was used to obtain high-quality data (https://github.com/lh3/seqtk.git).
The STACKS software pipeline (Catchen et al., 2011; Catchen et al., 2013; Paris et al., 2017; Rochette and Catchen, 2017; Rochette et al., 2019) was used to process the filtered high-quality ddRAD data. We identified loci from the short-read sequences and genotyped them using the STACKS 2.59 pipeline. Briefly, all sequences were processed by USTACKS (--deleverage -m 3 -M 4 -p 10 -t gzfastq -d -R) to build RAD tags; CSTACKS (-n 4 -p 40) was used to construct a catalog of consensus loci including all the loci from C. japonica samples and to merge all alleles together; SSTACKS was used to match the sets of stacks created in USTACKS against the catalog; TSV2BAM was used to convert the tsv files to BAM files; GSTACKS was used to align the reads to the loci and call SNPs; and POPULATIONS (populations -P stacks --popmap popmap.list -p 2 -t 25 --write-random-snp --vcf --fasta-samples --fasta-loci) was used to filter the catalog to produce a dataset for subsequent analyses. After STACKS processing, we obtained 4,504,050 SNPs. VCFtools software (version 0.1.16) (Danecek et al., 2011) was used to filter the loci based on (1) --min-alleles 2 and --max-alleles 2; (2) --max-missing 0.50; (3) --mac 3; and (4) --minDP 3. Finally, a total of 539,946 SNPs were retained after VCFtools processing.
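As an illustration of the four site-level criteria listed above, the following minimal Python sketch applies them to a toy genotype table. It is not VCFtools itself: the function name passes_filters, the tuple layout of the genotypes, and the assumption that genotypes below the depth threshold are masked before the call rate and minor allele count are evaluated are illustrative choices made here.

```python
# Toy illustration of the site filters; this is not VCFtools.
# Each genotype is (allele_a, allele_b, depth) with alleles coded 0/1,
# and None marks a missing call. All names are hypothetical.

def passes_filters(alleles, genotypes, max_missing=0.5, min_mac=3, min_dp=3):
    """Return True if a site survives the biallelic, depth, call-rate and MAC filters."""
    # (1) keep strictly biallelic sites (--min-alleles 2 --max-alleles 2)
    if len(alleles) != 2:
        return False
    # (4) genotypes with depth below min_dp are treated as missing (--minDP)
    called = [g for g in genotypes if g is not None and g[2] >= min_dp]
    # (2) require a call rate of at least max_missing (--max-missing)
    if len(called) / len(genotypes) < max_missing:
        return False
    # (3) minor allele count over the called genotypes (--mac)
    alt = sum(g[0] + g[1] for g in called)
    ref = 2 * len(called) - alt
    return min(alt, ref) >= min_mac


if __name__ == "__main__":
    site = [(0, 1, 10), (1, 1, 8), (0, 0, 5), (0, 1, 2)]
    print(passes_filters(["A", "G"], site))
```

In this toy site, the fourth genotype is masked because its depth is below 3, so the call rate and the minor allele count are computed from the remaining three individuals.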

Population genetic diversity, geographical patterns, region structure, and admixture
We calculated the nucleotide diversity (π), observed heterozygosity (H_O), expected heterozygosity (H_E), and inbreeding coefficient (F_IS) for each population.
Three methods were used to infer overall patterns of population genetic structure. Firstly, all the data were used for phylogenetic inference in MEGA (version 11.0.7) (Tamura et al., 2021) to obtain phylogenetic trees based on the NJ method. Secondly, we used GCTA software (Genome-wide Complex Trait Analysis, version 1.93.2) (Yang et al., 2011) for PCA based on the function (--make-grm --autosome). The GRM of GCTA was used to estimate genetic relationships among C. japonica individuals from SNP data. Finally, we used ADMIXTURE (version 1.3.0) (Alexander et al., 2009), which implements a model-based approach to infer populations and individuals in a maximum-likelihood (ML) framework. ADMIXTURE outperforms several other software programs in terms of analysis efficiency for genome-wide SNP data, which are sometimes large. We estimated ancestry coefficients for every individual in 10 replicate runs for each of K = 1-20. Then, we performed two replicate runs of 50-fold cross-validation for K = 1-20 to determine the potential error in each K. The K value with the lowest cross-validation error was considered the best K. Each individual in this study was assigned to a cluster under the best K model.
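For a single biallelic SNP in one population, the heterozygosity and inbreeding statistics used in this section can be sketched as follows. This is a textbook, uncorrected calculation that assumes diploid genotypes coded 0/1; the function name locus_stats is hypothetical, and the software used in the study may apply sample-size corrections that are omitted here.

```python
# Uncorrected per-locus H_O, H_E and F_IS for one population.
# Hypothetical helper for illustration; not the study's pipeline.

def locus_stats(genotypes):
    """genotypes: list of (a, b) allele codes (0 or 1) for one biallelic SNP."""
    n = len(genotypes)
    # Observed heterozygosity: fraction of heterozygous individuals
    h_obs = sum(1 for a, b in genotypes if a != b) / n
    # Frequency of allele 1 among the 2n sampled gene copies
    p = sum(a + b for a, b in genotypes) / (2 * n)
    # Expected heterozygosity under Hardy-Weinberg proportions
    h_exp = 2 * p * (1 - p)
    # F_IS = 1 - H_O / H_E; undefined for a monomorphic locus
    f_is = 1 - h_obs / h_exp if h_exp > 0 else float("nan")
    return h_obs, h_exp, f_is


if __name__ == "__main__":
    print(locus_stats([(0, 1), (0, 1), (0, 0), (1, 1)]))
    print(locus_stats([(0, 1), (0, 1), (0, 1), (0, 1)]))
```

A negative F_IS, as in the second toy population where every individual is heterozygous, reflects an excess of heterozygotes relative to Hardy-Weinberg expectations.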

Gene flow analysis
Three methods were used to analyze the gene flow between different regions of C. japonica in East Asia. Firstly, TREEMIX (version 1.13) (Pickrell and Pritchard, 2012) was used to infer splitting and mixture patterns among populations of C. japonica. Migration events (m) from 0 to 12 were specified between populations, and 10 iterations per m were tested. In this analysis, the "-bootstrap 1,000", "-m 0-12" and "-root" parameters were used to construct an ML tree. Secondly, we ran BA3-SNPs-autotune (version 2.1.2) with BAYESASS3-SNPs (version 1.1) (Wilson and Rannala, 2003; Mussmann et al., 2019) with the default delta values for allelic frequency, migration rates, and inbreeding coefficients, using the data from the six regions with C. canadensis (L.) DC. as the outgroup and the parameters -r 10 -g 10,000 -b 1,000. Thirdly, the R package diveRsity (version 1.9.90) (Keenan et al., 2013) in R (version 3.6.2) was used to analyze the relative direction of gene flow between these six regions by calculating Jost's D, Nei's G_ST, and Nm. The 95% confidence intervals were calculated from 1,000 bootstrap replicates to test for asymmetric gene flow (significantly higher in one direction than in the other).
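To make the three differentiation measures concrete, the sketch below computes them for one biallelic locus from per-region allele frequencies. It uses the standard textbook forms, G_ST = (H_T − H_S)/H_T, Jost's D = [(H_T − H_S)/(1 − H_S)] × n/(n − 1), and the island-model approximation Nm ≈ (1/G_ST − 1)/4, without sample-size corrections. The function name differentiation is hypothetical, and the estimators and directional weighting implemented in diveRsity may differ from this simplified version.

```python
# Textbook, uncorrected differentiation measures for one biallelic locus.
# Hypothetical helper for illustration; not the diveRsity implementation.

def differentiation(freqs):
    """freqs: frequency of one allele in each of n regions (n >= 2)."""
    n = len(freqs)
    # Mean within-region expected heterozygosity
    h_s = sum(2 * p * (1 - p) for p in freqs) / n
    # Total expected heterozygosity from the mean allele frequency
    p_bar = sum(freqs) / n
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        # Monomorphic locus: no differentiation can be measured
        return 0.0, 0.0, float("inf")
    g_st = (h_t - h_s) / h_t
    jost_d = (h_t - h_s) / (1 - h_s) * n / (n - 1)
    # Island-model approximation; unbounded when G_ST is zero
    nm = (1 / g_st - 1) / 4 if g_st > 0 else float("inf")
    return g_st, jost_d, nm


if __name__ == "__main__":
    print(differentiation([0.2, 0.8]))
```

Because Nm is a decreasing function of G_ST under this approximation, regions connected by high relative migration are those with low pairwise differentiation.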

RAD-seq data and SNP filtering
Approximately 742 million (741,584,140) raw reads were produced from 179 sampled individuals of Cryptotaenia. After

Population variation
Using genome-wide SNP markers, we detected variable levels of genetic diversity among populations of C. japonica from East Asia. The values of π ranged from 0.016 to 0.120, with a mean of 0.062; the values of H_O ranged from 0.020 to 0.158, with a mean of 0.066; the values of H_E ranged from 0.012 to 0.094, with a mean of 0.043; and the values of F_IS ranged from −0.095 to 0.051, with a mean of −0.014. F_IS was negative in 42 populations (Table 2).

AMOVA and mantel test results
For the 57 natural geographical populations of C. japonica, analysis of molecular variance (AMOVA) revealed that most (55.55%) of the observed genetic variation was among populations (Table 3). The remaining 44.45% of the total genetic variation was attributable to variation within local populations (p < 0.01), consistent with significant genetic differentiation among these 57 populations.
To investigate whether geographic distance contributed to the observed genetic differentiation among the 57 populations, we used the Mantel test to determine the relationship between geographic distance and genetic distance for all pairs of populations. There was no significant correlation between genetic distance and geographic distance among the 57 populations (r = −0.004887, P = 0.534) (Figure 4). Thus, no isolation by distance (IBD) pattern was found, suggesting that spatial distance was not an important factor shaping the genetic structure among wild populations of C. japonica.
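The logic of such a test can be sketched in a few lines of Python. The example below is a generic one-sided permutation Mantel test on toy matrices, combined with the linearized F_ST/(1 − F_ST) transformation; it is not the R implementation used in the study, and the function names (pearson, mantel, linearize), the number of permutations and the random seed are assumptions chosen for illustration.

```python
# Generic permutation Mantel test on toy data; illustrative only.
import random
from itertools import combinations


def linearize(fst):
    """Linearized genetic distance F_ST / (1 - F_ST)."""
    return fst / (1 - fst)


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mantel(gen, geo, permutations=999, seed=1):
    """Return (r, p) for two symmetric distance matrices (lists of lists)."""
    n = len(gen)
    pairs = list(combinations(range(n), 2))  # upper triangle only
    geo_vec = [geo[i][j] for i, j in pairs]
    r_obs = pearson([gen[i][j] for i, j in pairs], geo_vec)
    rng = random.Random(seed)
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        # Permute population labels of the genetic matrix (rows and columns together)
        rng.shuffle(order)
        r_perm = pearson([gen[order[i]][order[j]] for i, j in pairs], geo_vec)
        if r_perm >= r_obs:
            hits += 1
    # One-sided p-value; the observed arrangement counts as one permutation
    return r_obs, (hits + 1) / (permutations + 1)


if __name__ == "__main__":
    positions = [0.0, 1.0, 2.0, 4.0]  # toy populations along a transect
    geo = [[abs(a - b) for b in positions] for a in positions]
    fst = [[0.05 * abs(a - b) for b in positions] for a in positions]
    gen = [[linearize(v) for v in row] for row in fst]
    print(mantel(gen, geo))
```

Permuting whole rows and columns together, rather than individual matrix cells, preserves the dependence among distances that share a population, which is why a Mantel test is used instead of an ordinary correlation test.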

Genetic structure and geographical patterns
To further determine the phylogenetic relationships among these populations, we constructed NJ trees based on SNPs. The NJ trees revealed polyphyly in C. japonica, which could be separated into two clades with moderate bootstrap support (Figures 5A, B). C. canadensis, representing the outgroup (OUTG) lineage, diverged first from the other lineages. Geographically, there was a tendency toward serial divergence in C. japonica from southern to northern East Asia. One clade included the SC (South China), NA (Northeast Asia), and TW (Taiwan Island) populations, and the other clade included the EC (East China), WC (West China), and CC (Central China) populations (Figures 2, 5).
The SC region includes most of southern China (part of Fujian, Guangdong, and Guangxi Provinces) and part of Jiangxi, Hunan and Chongqing Provinces of China, with 19 populations (bootstrap support 100%). The NA region includes part of Northeast China, the Korean Peninsula, and the Japanese Islands, with 8 populations (bootstrap support 100%). The TW region includes the Taiwan Province of China, with 3 populations (bootstrap support 34%). The EC region includes parts of Hubei, Anhui, Jiangsu, Zhejiang, and northern Fujian Provinces in China, with 11 populations (bootstrap support 47%). The WC region includes part of the Henan, Hubei, Guizhou, Gansu, Shaanxi, Sichuan, and Yunnan Provinces of China, with 13 populations (bootstrap support 57%). The CC region includes parts of Jiangxi, Henan and Shanxi Provinces of China, with 3 populations (bootstrap support 22%), where the JXJG population is located in southern Jiangxi Province. The individuals of C. japonica in the SC region had longer branches than the individuals in the other five regions (Figure 5B).
All the individuals are grouped together with their corresponding populations in the CC, WC, and EC regions. However, we also found that individuals 68-3 (TWYL) in the TW region and 53-1 (GDYS), 39-1, 38-3 (HNSY) and 45-2 (JXJJ) in the SC region did not cluster with other individuals of their populations. The individuals from the RBCY (79) and RBQT (78) populations are admixed in the NA region branch. In the phylogenetic tree, the above individuals were nevertheless located near the respective populations in their regions (Figures 5A, B). The CC region contains only one form, C. japonica f. dissecta, while WC, EC, TW, and NA contain two forms: C. japonica f. dissecta and C. japonica f. japonica. The SC region has three forms: C. japonica f. dissecta, C. japonica f. japonica and C. japonica f. pinnatisecta (Table 1; Figure 6A).
FIGURE 4 Mantel tests between genetic distance and geographic distance among the C. japonica populations.
The HSTK population in the EC region has C. japonica f. dissecta (02) and C. japonica f. japonica (01 and 03). The GXLS population in the SC region has C. japonica f. japonica (32) and C. japonica f. pinnatisecta (33). The GDSG population includes C. japonica f. dissecta (50), C. japonica f. japonica (48) and C. japonica f. pinnatisecta (49). Individuals of different forms originating from the same population are still clustered together in the phylogenetic tree.
The GXHZ population has C. japonica f. dissecta (36), C. japonica f. japonica (35) and C. japonica f. pinnatisecta (34). The HNSY population has C. japonica f. dissecta (38) and C. japonica f. japonica (39). In the GXHZ and HNSY populations, some individuals originating from the same population did not cluster together on the phylogenetic tree; instead, they fell into separate parts of the tree, where they still clustered with other individuals from the same population, regardless of the form to which they belonged (Table 1; Figure 5A).

PCA and bar plot of ancestry coefficient results
PCA was performed on the six regions; PC1 and PC2 explained 17.45% and 11.92% of the variation, respectively, and together identified six groups (Figure 6A). The EC, CC and WC regions clustered closely together, as did the TW and NA regions, whereas SC formed a separate group. However, some WC populations were embedded in the EC region, while the other populations formed separate groups. The PCA indicated that the genetic relationships among the EC, CC and WC regions were relatively close. The relationships among these regions were similar to those in the NJ tree results (Figures 5A, B).
To understand the regional genetic structure of the six regions, we used 47,033 unlinked SNPs for genetic structure analysis. At K = 2, the NA and CC regions first diverged from the other regions and formed two independent clusters, indicating that both have relatively independent genetic backgrounds. At K = 3 to 5, the SC region diverged next and showed little genetic admixture. Although the optimal K determined by ADMIXTURE analysis was 7, the ADMIXTURE outcomes at K = 6 aligned closely with the findings from the PCA and NJ analyses (Figures 6B, C). At K = 6, individuals in the NA and TW regions showed similar admixture patterns, as did individuals in the EC and WC regions, and these patterns were distinct from those of the other regions. In summary, all three analyses demonstrated similar patterns for the relationships between these regions (Figures 6B, C).

Genetic diversity and population structure
Based on the six regions, Kruskal-Wallis tests detected significantly different levels of π between the NA region and the EC region (p = 0.019) and between the NA region and the SC region (p = 0.029). Significant differences were also detected in H_O between the NA and EC regions (p = 0.012) and between the NA and SC regions (p = 0.014), and in H_E between the NA and EC regions (p = 0.011) and between the NA and SC regions (p = 0.027). F_IS was significantly lower in the SC region (F_IS = −0.031) than in the NA region (F_IS = 0.008) (p < 0.05), TW region (F_IS = 0.022) (p < 0.05), and WC region (F_IS = −0.009) (p < 0.05). Furthermore, F_IS in the TW region (F_IS = 0.022) was significantly greater than that in the EC region (F_IS = −0.019) (p < 0.05) based on ANOVA.
Based on Nei's G_ST and Nm values, we obtained the same gene flow patterns (Figures 8B, C). The results showed that gene flow was more pronounced within mainland East Asia than within the Korean Peninsula, Japanese Archipelago and Taiwan Island, with relatively limited migration between the mainland and these areas. The main reason may be the isolation between the islands and the mainland during species distribution, resulting in a lack of opportunities for gene exchange between them. The SC region also showed lower imported gene flow, and the EC region appeared at the center of the migration network. The most obvious difference among the gene flow patterns in Figures 8A, B, and C was the absence of significant asymmetric gene flow from the TW region to the EC or WC region.

Discussion
The study of genetic diversity and population structure in widespread plants, such as Quercus cerris L. in Europe (Lados et al., 2024), Allium macrostemon Bunge in Japan (Probowati et al., 2023), and Capsicum pubescens Ruiz & Pav. in America (Palombo and Carrizo García, 2022), provides valuable insights into the factors influencing genetic differentiation. These factors include selection pressure, gene flow, and life history (Gamba and Muchhala, 2020). For long-lived species, gene flow among populations can significantly mitigate genetic erosion caused by habitat fragmentation (Fuller and Doyle, 2018). Conversely, short-lived herbs, with their rapid generational turnover, may experience a more pronounced decline in genetic diversity under similar fragmentation conditions (Young et al., 1996). Moreover, plants with abiotic-mediated pollination and seed dispersal are less susceptible to habitat fragmentation than those relying on animal-mediated mechanisms (Sato and Kudoh, 2014; Fontúrbel et al., 2015).
In the case of Cryptotaenia japonica, an annual or biennial herb found in damp forest areas, streams, and ditches, our AMOVA revealed nearly equal genetic variation between and within populations (55.55% and 44.45%, respectively) (Table 3). This finding is consistent with the low and nonsignificant π, H_O, H_E, and F_IS values between populations and with the pairwise F_ST values, indicating low genetic differentiation, minimal geographic structure, and high gene flow among C. japonica populations across the studied regions (Tables 1, 3, and 4). Typically, outcrossing species exhibit higher genetic diversity than selfing species (Nybom, 2004). The negative inbreeding coefficients (F_IS) observed for most populations (Table 2) suggest that inbreeding is uncommon within C. japonica populations, which results in lower genetic differentiation but also lower genetic diversity. Our analysis revealed no clear genetic separation based on geographic distribution for C. japonica populations (Figure 4). Pairwise F_ST values between populations across regions correlated weakly with geographic distances (Supplementary Tables S1, S5, S6), implying that populations within the same region share similar genetic backgrounds. This pattern is likely due to continuous and similar habitat preferences within geographic units, allowing high gene flow across relatively flat and expansive areas (Mims et al., 2016).
Previous studies have shown that water can significantly aid seed dispersal for wind-dispersed species across fragmented landscapes (Soomers et al., 2012; Yuan et al., 2022). We hypothesize that water flow may similarly facilitate the seed dispersal of C. japonica. Although this hypothesis remains speculative due to the lack of detailed morphological and experimental evidence, the genetic evidence presented here indicates that C. japonica has sufficient seed dispersal capabilities to maintain moderate-to-high levels of gene flow and population connectivity over large spatial and temporal scales.
The discontinuous distribution of C. japonica between mainland China and Taiwan, as well as between mainland China and Japan, can be attributed primarily to geological and climatic history. During the glacial periods of the Pleistocene epoch, lower sea levels resulted in land bridges connecting the Asian mainland to Taiwan and Japan, facilitating plant migration across these regions. Post-ice age sea level rise subsequently submerged these land bridges, isolating mainland plant populations from those on the islands (Harrison et al., 2001; Qiu et al., 2011). Climatic changes, particularly the East Asian Monsoon system, which brings wet summers and dry winters, have also significantly influenced plant distributions. The monsoon's variable impact across the region, due to topographical differences, has led to diverse microclimates that support genetic differentiation (Qiu et al., 2011). Gene flow events among the six regions were identified using TreeMix (Figure 7). Notably, strong gene flow was observed from the TW region to the EC region, as indicated by Jost's D analysis (Figure 8A). The EC region emerged as a central hub for gene flow based on diveRsity gene flow analysis (Jost's D, Nei's G_ST, and Nm), suggesting its role as a genetic "bridge" throughout the East Asian distribution area.
Overall, our findings highlight the complex interplay between genetic diversity, gene flow, and geographic factors in shaping the population structure of C. japonica. The observed high levels of gene flow and low genetic differentiation suggest that C. japonica has maintained connectivity across its range, despite historical and contemporary geographical barriers. This study provides a valuable reference for future research on the genetic diversity of widespread plants in Asia and underscores the importance of considering both historical and contemporary processes in understanding plant population dynamics.

Conclusion
This study provides a comprehensive analysis of the genetic diversity and population structure of Cryptotaenia japonica across East Asia. The findings reveal substantial genetic differentiation among populations, with significant variation in genetic diversity metrics across different regions. The lack of isolation by distance suggests that historical and ecological factors may be more influential in shaping the genetic structure of C. japonica populations. These insights contribute to our understanding of the genetic dynamics of widespread plant species in Asia and provide a foundation for future conservation and research efforts.

FIGURE 2 Sampling points for C. japonica populations. The pink dots show the populations in the NA region, the yellow dots indicate the TW region, the dark blue dots indicate the EC region, the green dots indicate the CC region, the sky blue dots indicate the WC region, and the white dots indicate the SC region. The map was produced by GeoMapApp (http://www.geomapapp.org/).

FIGURE 3 Distribution of forms across elevations. Blue dot = C. japonica f. japonica; yellow dot = C. japonica f. dissecta; white dot = C. japonica f. pinnatisecta. Larger dots indicate higher elevations. The map was produced by GeoMapApp (http://www.geomapapp.org/).

FIGURE 5 Neighbor-joining (NJ) phylogenetic tree of 176 C. japonica plants with C. canadensis as the OUTG. (A) Consensus NJ tree with bootstrap values. (B) NJ tree with branch lengths. The three color gradient bars in (A) show the 3 forms of C. japonica for each individual, where blue indicates C. japonica f. japonica, orange indicates C. japonica f. dissecta, and red indicates C. japonica f. pinnatisecta. The color scheme in (B) also corresponds to the highlights of each branch in (A) and the dot colors in Figure 2.
FIGURE 6 (A) Principal component analysis (PCA) displaying the first two axes (PC1 and PC2). (B) Values of the cross-entropy criterion for numbers of clusters ranging from K = 1 to 20. (C) Bar plot of ancestry coefficients for K = 2 to 7. Detailed information on the six regions can be found in Tables 1, 2 and Figure 2.

FIGURE 7 (A) TreeMix graph of the relationships among the sampled C. japonica regions. Migration events corresponding to directional gene flow are indicated by arrows. The arrow color indicates the migration weight, with darker orange indicating a stronger genetic effect on the destination region. (B) Scaled residuals from the fit of the model to the data.
FIGURE 8 (A) The migration network for C. japonica among different geographic regions based on Jost's D values; (B) The migration network for C. japonica among different geographic regions based on Nei's G_ST values; (C) The migration network for C. japonica among different geographic regions based on Nm values. The arrows indicate the direction of gene flow. The thickness of the arrow indicates the strength of gene flow between different regions, and the number indicates the relative migration value.

TABLE 1
Localities of the three different forms of C. japonica and C. canadensis populations sampled.

F_ST was used as a measure of genetic distances among pairs of populations using the POPULATIONS program in STACKS (version 2.59), and the transformation F_ST/(1 − F_ST) was applied. Geographical distances were calculated as pairwise shortest distances among populations using the geosphere program (version 1.5-14) (https://github.com/phiala/ecodist.git) in R (version 3.6.2). Mantel tests were used to measure the correlation between genetic and geographical distances.

C. japonica individuals from SNP data. Finally, we used ADMIXTURE (version 1.3.0)

TABLE 2
Genetic characteristics of the sampled populations.

and trimming, 679.72 million high-quality reads were retained. A catalog containing 494,675,435 loci was constructed, and 2,889,851 loci were genotyped by GSTACKS using the datasets of all Cryptotaenia samples. The mean, minimum, and maximum values for effective per-sample coverage were 34.7×, 15.2×, and 96.1×, respectively. After filtering low-quality loci (minor allele frequency < 0.01; missing rate > 0.5), 539,946 unlinked SNPs were eventually identified and used for all subsequent analyses.

TABLE 3
AMOVA of genetic variation among 57 populations of C. japonica.

TABLE 5
Model log-likelihood for TreeMix models with 1-12 migration events.

TABLE 6
Matrix of inferred gene flow between genetic regions.