
ORIGINAL RESEARCH article

Front. Genet., 18 December 2025

Sec. Evolutionary, Population, and Conservation Genetics

Volume 16 - 2025 | https://doi.org/10.3389/fgene.2025.1727583

This article is part of the Research Topic: Insights in Evolutionary and Population Genetics: 2025

Integrative linkage and recombination analysis of 25 X-STRs across 7 linkage groups using pedigree-based and SNP-based strategies

Jinglei Qian1, Xiaoqin Qian2, Qiqi Ji1, Zhimin Li1, Chengchen Shao1, Hongmei Xu1, Fan Yang3*, Jianhui Xie1*
  • 1Department of Forensic Medicine, School of Basic Medical Sciences, Fudan University, Shanghai, China
  • 2West China School of Basic Medical Sciences and Forensic Medicine, Sichuan University, Chengdu, China
  • 3Shanghai Key Laboratory of Crime Scene Evidence, Shanghai Research Institute of Criminal Science and Technology, Shanghai, China

Introduction: X-chromosomal short tandem repeats (X-STRs) are valuable genetic markers in forensic science for resolving complex kinship scenarios. However, the linkage relationship and recombination of X-STRs remain poorly characterized.

Methods: Taking advantage of high-density SNP data with relatively low mutation rates, we developed a pedigree-based method to analyze recombination between X-STR linkage groups (LGs). We used X-STR data from 66 two-generation families and phased X-SNP data from 602 trios from the 1000 Genomes Project. We investigated the linkage relationships among 25 X-STR loci grouped into seven LGs: three previously identified regions (Xp21.1, Xq21.31, Xq23) and the four Argus X-12 regions (Xp22.2, Xq12, Xq26, Xq28), which together provide broader coverage of the X chromosome.

Results: We found strong intra-group linkage (maximum logarithm of odds (MLOD) > 5) and near independence between groups (MLOD < 1). Recombination fractions estimated from X-STR family data ranged from 0.0000 to 0.0487 within LGs and from 0.1561 to 0.4133 between LGs, while the SNP-based analysis detected recombination between the seven LGs in approximately 50% of informative meioses. LD decay analysis showed that R² dropped to 0.1 at a distance of approximately 3.7 kb, supporting the feasibility of using SNP-derived LD signals to infer STR recombination patterns at fine scale.

Discussion: Family-based analysis of phased X-SNP data, which combine a relatively low mutation rate with high marker density, provides a more robust framework for evaluating X-STR linkage, particularly for newly developed loci lacking extensive haplotype databases.

Conclusion: These findings contribute to a more precise understanding of X-STR inheritance and enhance their reliability in forensic kinship analysis.

1 Introduction

Short tandem repeats (STRs) are widely used in forensic kinship analysis. When autosomal STRs do not provide sufficient discriminatory power for relationship inference, X-chromosomal, Y-chromosomal, and mitochondrial markers can serve as complementary tools (Gomes et al., 2020). Among them, X-chromosomal STRs (X-STRs) exhibit a distinct inheritance pattern: males inherit a single maternal X chromosome (excluding pseudoautosomal regions), effectively representing a haplotype, while females inherit one X chromosome from each parent, enabling meiotic recombination between homologous X chromosomes. This inheritance pattern makes X-STRs particularly powerful in specific kinship scenarios, such as cases with missing samples (one or both parents missing), paternal half-sibling identification, and incest cases, where autosomal STRs (A-STRs) or Y-STRs may yield uncertain or statistically weak results. This highlights the significant practical value of X-STRs in forensic cases (Gomes et al., 2020; Shrivastava et al., 2014).

Currently, several commercial X-STR amplification systems are available. The Argus X-12 Kit uses X-STR markers from four well-established linkage groups located in the Xp22.2, Xq12, Xq26, and Xq28 regions (Nothnagel et al., 2012; Bini et al., 2019). However, the genetic information yielded by markers from only four linkage groups may not be sufficient for accurate inference of specific kinship relationships, especially in scenarios involving mutation or recombination events within a linkage group (Hering et al., 2015). Expanding the number of well-characterized, independently inherited linkage groups would improve the reliability of forensic inference. Other commercial kits, such as the AGCU X19 STR kit (Zhang et al., 2016), Goldeneye 17X (Tao et al., 2020), and the Microreader 19X ID System kit (Xiao et al., 2021), include additional X-STR markers located outside these four regions. Given that all X-STRs are located on the same chromosome, understanding their linkage status and recombination behavior is crucial for forensic application (Tillmar et al., 2017). The linkage relationships of newly developed X-STRs have usually been investigated by LD analysis of population data and, less often, by pedigree-based analysis of recombination rates (Rc). However, in some cases, distinguishing intra-group recombination from de novo mutations remains challenging (Yang et al., 2021). These methods also have shortcomings, such as requiring a large amount of family data and difficulty in determining parental genotypes. Although the physical distance between two adjacent X-STRs in a linkage group is very short and low recombination rates have been observed between them, only a small subset of locus pairs exhibited significant LD in the population (Sufian et al., 2017; Zhang et al., 2016; Zhang et al., 2012; Yang et al., 2019; Yang et al., 2017). For instance, DXS10074 and DXS10075 are only 0.02 Mb apart, yet they were found to be in linkage equilibrium in earlier reports (Zhang et al., 2016; Yang et al., 2019). For highly polymorphic STRs, detecting significant LD may be challenging and require a relatively large sample size. This discrepancy may be due to long-term evolutionary factors, such as historical recombination, mutation accumulation, or genetic drift, which could disrupt population-level LD even between closely linked markers. The weak LD observed for certain X-STRs may therefore reflect a genuine lack of strong association in the population due to these factors, or simply the limited power of population-level LD tests to capture weak associations between STRs (O'Connor et al., 2011). In general, the linkage relationships of these additional markers are not yet clear. This uncertainty can lead to errors in likelihood ratio (LR) calculations and affect the interpretation of kinship results (Tillmar et al., 2017).

Single nucleotide polymorphisms (SNPs) derived from whole-genome sequencing (WGS) have been applied in genetic genealogy studies (Morimoto et al., 2016). However, issues such as cost, accuracy, and lack of standardization still limit their broader application in forensic medicine (Kling et al., 2021). Therefore, X-STRs remain the conventional choice in forensic genetic analysis. On the other hand, SNPs have the advantages of a relatively low mutation rate and high density. We hypothesized that integrating phased SNP data from large public repositories can overcome the limitations of traditional family studies for recombination mapping. The high-depth 1000 Genomes Project (1kGP) data provide a clear set of phased genotypes, which directly solves the problem of undetermined genotypes that may affect recombination rate estimation in traditional X-STR studies, thereby improving the reliability of SNP analysis.

To address the limitations of X-STR recombination research and enhance analytical resolution, we incorporated phased SNP data from 602 trios in the 1000 Genomes Project (1kGP) (Byrska-Bishop et al., 2022) and extracted the phased X-SNP haplotypes of offspring and parents in regions flanking the target X-STR loci. Unlike the HapMap data (Phillips et al., 2012), for which phase was undetermined, these phased data allowed us to directly identify exchange events and thereby estimate the recombination rates between X-STRs. This approach provides a novel strategy to investigate X-STR recombination using publicly available SNP pedigree data. In our previous report, three additional linkage groups in the Xp21.1, Xq21.31, and Xq23 regions were identified based on marker discrimination power and physical position on the X chromosome (Yang et al., 2021). In this study, we combined these three linkage groups with the four previously established ones (Nothnagel et al., 2012; Bini et al., 2019), bringing the total to seven. We systematically analyzed the linkage relationships among 25 X-STRs within these seven groups using both family-based X-STR data and the SNP-based approach with 1kGP trios (Byrska-Bishop et al., 2022).

2 Materials and methods

2.1 DNA sample preparation

Blood samples were collected from 66 two-generation families (Table 1) from China with informed consent. Genomic DNA was extracted from FTA cards using the ReadyAmp Genomic DNA Purification System (Promega, United States), and the concentration was determined using a NanoDrop-1000 spectrophotometer (ThermoFisher, United States). 2800M DNA (Promega, United States) was used as the control DNA.


Table 1. Information on the families investigated in this study.

2.2 Data collection from the 1000 Genomes Project

High-coverage whole-genome sequencing data from the expanded 1000 Genomes Project (1kGP) cohort, which includes 602 trios, were used for the linkage and recombination analysis based on the locations of the X-STR loci (Byrska-Bishop et al., 2022). To detect recombination between the two junction X-STRs of adjacent clusters, SNPs with a minor allele frequency (MAF) of ≥0.2 were extracted from the 602 pedigrees in the phased 1kGP cohort (if the child is male, the father's genotype data are not needed). The phased SNP data were extracted and filtered from the VCF files by trio ID using VCFtools (Danecek et al., 2011).
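The MAF filter described above can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which used VCFtools; the function names and the toy SNP IDs and alleles are hypothetical, and SNPs are assumed to be biallelic and coded 0/1 per haplotype. Exact fractions are used so that a SNP sitting exactly on the 0.2 threshold is not lost to floating-point rounding.

```python
from fractions import Fraction
from typing import Dict, List


def minor_allele_frequency(alleles: List[int]) -> Fraction:
    """Return the exact MAF of a biallelic SNP from 0/1-coded haplotype alleles."""
    if not alleles:
        raise ValueError("no alleles supplied")
    alt_count = sum(alleles)
    minor_count = min(alt_count, len(alleles) - alt_count)
    return Fraction(minor_count, len(alleles))


def filter_by_maf(snps: Dict[str, List[int]],
                  threshold: Fraction = Fraction(1, 5)) -> List[str]:
    """Keep the IDs of SNPs whose MAF is at least `threshold` (0.2 in the text)."""
    return [snp_id for snp_id, alleles in snps.items()
            if minor_allele_frequency(alleles) >= threshold]


# Toy data only: hypothetical SNP IDs, ten phased haplotype alleles each.
toy_snps = {
    "snp_a": [0, 1, 0, 1, 1, 0, 0, 1, 0, 1],  # five alternate alleles of ten
    "snp_b": [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],  # one alternate allele of ten
    "snp_c": [1, 1, 1, 1, 1, 1, 1, 1, 0, 0],  # eight alternate alleles of ten
}
kept = filter_by_maf(toy_snps)
```

In this toy input, `snp_b` falls below the threshold and is dropped, while `snp_c` sits exactly on it and is kept.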

2.3 STR locus selection and primer design

Ten X-STRs from three linkage groups in the Xp21.1, Xq21.31 and Xq23 regions described in our previous study (Yang et al., 2021) were adopted based on their physical location in the human genome and their polymorphism in the Chinese Han population (Figure 1; Supplementary Table S1). DXS10148 was replaced by DXSF02 in linkage group 1 (LG1), and one novel X-STR, DXSF13, was added to LG3, to achieve closer physical distances within linkage groups and to accommodate more X-STRs in electrophoresis. Finally, 25 X-STRs in 7 linkage groups were adopted: LG1: DXS10135-DXS8378-DXSF02 (Xp22.31); LG2: DXSF07-DXSF08-DXSF09 (Xp21.1); LG3: DXS10079-DXSF13-DXS10074-DXS10075 (Xq12); LG4: DXSF15-DXS6803-DXSF18-DXSF19 (Xq21.31); LG5: DXSF28-DXSF29-DXSF33 (Xq23); LG6: DXS10103-DXS10102-HPRTB-DXS10101 (Xq26); LG7: DXS10146-DXS10134-DXS10147-DXS7423 (Xq28). The physical and genetic locations on the X chromosome are shown in Supplementary Table S1.

Figure 1
Flowchart illustrating the process of phasing inference and recombination rate calculation using SNP and X-STR data. It begins with phased data from the 1000 Genomes project, filtering X-SNPs with a minor allele frequency of at least 0.2. The process identifies informative and uninformative families based on SNPs and calculates recombination rates. A diagram shows phased X chromosomes with target X-STR locus; matching flanking SNPs lead to haplotype assignment.

Figure 1. Workflow for estimating recombination rates between X-STR linkage groups using phased X-SNP data from the 1000 Genomes Project.

All primer sets were designed using the Primer 5 software (PREMIER Biosoft, United States) and checked with the UCSC In-Silico PCR tool. Product lengths ranged from 100 to 450 bp, and primers were labeled with one of four fluorescent dyes (5-FAM, HEX, TAMRA, and ROX). Details are shown in Supplementary Table S2.

2.4 PCR amplification and capillary fluorescence electrophoresis

The 25 X-STR loci were amplified simultaneously in a total reaction volume of 12.5 µL containing 6.25 µL of 2× Qiagen Multiplex PCR Master Mix (Qiagen, Germany), 2 µL of primer mixture, 0.5 µL of 4 μg/μL bovine serum albumin (BSA) solution (Sangon Biotech, China), 1 µL of DNA and 2.75 µL of nuclease-free water, according to the manufacturer’s recommendations and calibration results. PCR was performed on a Mastercycler nexus GSX1 (Eppendorf, Germany) under the following cycling conditions: initial denaturation at 94 °C for 5 min; 32 cycles of denaturation at 95 °C for 30 s, annealing at 58.5 °C for 90 s and extension at 72 °C for 30 s; final extension at 68 °C for 30 min; and storage at 4 °C. For electrophoresis genotyping, 1 µL of PCR product was mixed with 6 µL of a 25:1 (v/v) mixture of HiDi formamide (ThermoFisher, United States) and Size Standard Org500 (Microread, China). The mixture was then denatured at 95 °C for 4 min and cooled at 4 °C for 5 min. PCR products were separated by capillary electrophoresis on an ABI PRISM 3130xL Genetic Analyzer (Applied Biosystems, United States). Alleles of X-STRs were determined based on bins using GeneMapper ID software v3.2 (Applied Biosystems, United States) (Supplementary Figure S1). Sanger sequencing (Saiheng Biotech, China) was conducted on the PCR amplification products of the new loci to confirm the STR sequence structures.

2.5 Statistical analysis

Linkage disequilibrium (LD) analysis offers an essential method to assess the association between loci, though accurate LD estimation generally requires large-scale population datasets with phased haplotype information. This presents a challenge in forensic genetics, especially when validating newly developed loci. The maximum logarithm of the odds (MLOD) score is a commonly used statistical measure in linkage analysis, with higher scores indicating stronger evidence of linkage (Yoo and Mendell, 2008). The MLOD score was calculated from the family data with Mendel 16.0 using the Kosambi mapping function and marker allele frequencies, with no disease model specified (Lange et al., 2013); a score of >3.0 was considered to indicate significant linkage. To examine the LD characteristics of the 7 linkage groups, we used LDBlockShow 1.4 (Dong et al., 2021) to visualize their local LD structure based on X-SNPs from the 1000 Genomes Project. For each linkage group, the target region included a 10 kb region upstream of the first STR marker and a 10 kb region downstream of the last marker. LD blocks were estimated using the Gabriel method based on pairwise D′ values (Gabriel et al., 2002). To investigate the linkage disequilibrium pattern of the X chromosome, we conducted an LD decay analysis using PopLDdecay version 3.42 (Zhang et al., 2019). The LD decay analysis was performed using pairwise R² values calculated across the entire X chromosome, with a MAF threshold of 0.05. The input data were extracted from the 1000 Genomes Project.
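To make the MLOD statistic and the Kosambi mapping function concrete, the following Python sketch shows the textbook two-point case. It assumes phase-known meioses, in which k recombinants among n informative meioses give LOD(θ) = log10[θ^k (1 − θ)^(n − k) / 0.5^n], maximized at θ = k/n (capped at 0.5). This is a deliberate simplification and not the pedigree likelihood that Mendel evaluates; all function names are hypothetical.

```python
import math
from typing import Tuple


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """Two-point LOD for phase-known meioses at recombination fraction theta."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    k, n = recombinants, meioses
    if theta == 0.0:
        # Any observed recombinant is impossible under theta = 0.
        if k > 0:
            return float("-inf")
        log_likelihood = 0.0
    else:
        log_likelihood = k * math.log10(theta) + (n - k) * math.log10(1.0 - theta)
    # Subtract the log-likelihood under free recombination (theta = 0.5).
    return log_likelihood - n * math.log10(0.5)


def max_lod(recombinants: int, meioses: int) -> Tuple[float, float]:
    """Return (theta_hat, MLOD), with theta_hat = k/n capped at 0.5."""
    theta_hat = min(recombinants / meioses, 0.5)
    return theta_hat, lod_score(recombinants, meioses, theta_hat)


def kosambi_distance_cm(theta: float) -> float:
    """Kosambi map distance in centimorgans for a recombination fraction."""
    if not 0.0 <= theta < 0.5:
        raise ValueError("theta must lie in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * theta) / (1.0 - 2.0 * theta))


def kosambi_theta(distance_cm: float) -> float:
    """Inverse Kosambi function: recombination fraction from centimorgans."""
    return 0.5 * math.tanh(2.0 * distance_cm / 100.0)
```

Under this simplification, 20 informative meioses with no recombinant give an MLOD of 20·log10(2), about 6.02, while a pair that recombines in half of its meioses scores 0, which is the contrast the >3.0 threshold is meant to separate.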

The recombination fraction of X-STRs from family data was calculated with the Recombulator-X software (Aneli et al., 2023). Recombination rates were also calculated from phased family X-SNP data, following the process shown in Figure 1. We used the trio structure (mother-son or mother-father-daughter) to determine the transmission of maternal X-chromosomal haplotypes. Since males inherit a single maternal X chromosome and females inherit one X chromosome from each parent, we could unambiguously trace the origin of alleles provided the mother was heterozygous. SNPs from the 10 kb upstream and downstream segments flanking each X-STR were used for the investigation. The 10 kb flanking window was selected to align with the typical linkage disequilibrium (LD) block structure of the human genome (Gabriel et al., 2002), balancing marker informativeness with linkage specificity based on the LD decay result. At least one SNP that is heterozygous in the mother is required in each segment for it to be informative in determining the transmission of chromosomal segments. If the upstream and downstream segments from the same maternal X chromosome were both transmitted to the offspring, the chromosomal region harboring the X-STR was considered to have been transmitted from that chromosome. When the two junction X-STRs in the offspring originated from different maternal X chromosomes, a recombination event was recorded. If all SNPs within the 10 kb flanking interval were homozygous in the mother, the parental origin of the transmitted segment could not be distinguished. If the upstream and downstream flanking SNPs indicated inheritance from different maternal X chromosomes (i.e., a recombination event occurred within the region spanning the X-STR), the linkage phase of the X-STR allele itself became ambiguous. To ensure accuracy, we conservatively excluded these cases rather than inferring the STR origin probabilistically, which prevents the introduction of phase errors. Informative SNPs and recombination events were analyzed with in-house Python scripts and Excel. The 95% confidence intervals (95% CIs) were calculated based on the binomial distribution (http://statpages.org/confint.html).
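The decision rules in this paragraph can be sketched in Python as follows. This is an illustrative reconstruction and not the authors' in-house script: the function names, the 0/1 allele coding, and the representation of each flanking segment as a triple (maternal haplotype 0, maternal haplotype 1, child's maternally inherited haplotype) are all assumptions made for the example.

```python
from typing import List, Optional, Sequence, Tuple

Hap = Sequence[int]
Segment = Tuple[Hap, Hap, Hap]  # (mother hap 0, mother hap 1, child's maternal hap)


def segment_origin(mat_h0: Hap, mat_h1: Hap, child_mat: Hap) -> Optional[int]:
    """Return 0 or 1 for the maternal haplotype transmitted across a segment.

    Returns None when no SNP is heterozygous in the mother (uninformative)
    or when the heterozygous SNPs disagree about the origin.
    """
    origins = set()
    for a0, a1, c in zip(mat_h0, mat_h1, child_mat):
        if a0 == a1:
            continue  # homozygous in the mother: carries no phase information
        origins.add(0 if c == a0 else 1)
    return origins.pop() if len(origins) == 1 else None


def str_origin(upstream: Segment, downstream: Segment) -> Optional[int]:
    """Assign the X-STR to a maternal haplotype only if both flanks agree."""
    up = segment_origin(*upstream)
    down = segment_origin(*downstream)
    if up is None or down is None or up != down:
        return None  # uninformative or ambiguous: conservatively excluded
    return up


def classify_meiosis(origin_a: Optional[int], origin_b: Optional[int]) -> str:
    """Compare the origins of two junction X-STRs from adjacent groups."""
    if origin_a is None or origin_b is None:
        return "uninformative"
    return "recombinant" if origin_a != origin_b else "non-recombinant"


def recombination_fraction(calls: List[str]) -> float:
    """Recombinants divided by informative meioses."""
    informative = [c for c in calls if c != "uninformative"]
    if not informative:
        raise ValueError("no informative meioses")
    return sum(c == "recombinant" for c in informative) / len(informative)


# Toy trio with hypothetical alleles for two junction X-STRs, A and B.
up_a: Segment = ([0, 1, 0], [0, 0, 1], [0, 1, 0])
down_a: Segment = ([1, 1], [0, 1], [1, 1])
up_b: Segment = ([0, 0], [1, 0], [1, 0])
down_b: Segment = ([1, 0], [0, 0], [0, 0])
call = classify_meiosis(str_origin(up_a, down_a), str_origin(up_b, down_b))
```

In the toy trio, locus A traces to maternal haplotype 0 and locus B to haplotype 1, so the meiosis is called recombinant. Returning `None` for any ambiguous flank mirrors the conservative exclusion described in the text.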

3 Results

3.1 Definition and selection of 25 X-STR markers across seven linkage groups

Based on the polymorphism of X-STRs on Xp21.1, Xq21.31 and Xq23 (LG2, LG4 and LG5 in this study) from our previous investigation (Yang et al., 2021) and the physical distances between X-STRs, a total of 25 X-STRs were selected for inclusion in this study (Supplementary Table S2). The physical distribution of the 25 X-STRs and the locations of the 7 linkage groups are illustrated in Figure 2. Following recommended nomenclature conventions (White et al., 1997; Novroski et al., 2019), the newly identified X-STR loci were named using the “DXSF0n” format. Loci previously labeled as “X0n” in our earlier study (Yang et al., 2021) have been renamed accordingly. In terms of repeat motif structure, DXSF13 consists of pentanucleotide repeats, whereas the remaining newly identified loci are composed of tetranucleotide repeat units (Supplementary Table S2). Notably, several alleles of DXSF02 contain an incomplete repeat motif. Alleles with the sequence structure (ATAG)m ATG (ATAG)n were assigned the allele designation (m + n).3.
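The (m + n).3 designation rule for DXSF02 can be expressed as a short Python sketch. The function name is hypothetical, and two points are assumptions beyond the text: that both m and n are at least 1, and that an uninterrupted (ATAG)n repeat region is designated simply n, as in standard STR nomenclature.

```python
import re
from typing import Optional

# (ATAG)m ATG (ATAG)n with m, n >= 1 (assumed); the ATG is the incomplete unit.
_INTERRUPTED = re.compile(r"^((?:ATAG)+)ATG((?:ATAG)+)$")
# Uninterrupted repeat region (assumed to follow standard whole-number naming).
_SIMPLE = re.compile(r"^(?:ATAG)+$")


def dxsf02_allele(repeat_region: str) -> Optional[str]:
    """Return the allele designation for a DXSF02 repeat region, or None."""
    seq = repeat_region.upper()
    if _SIMPLE.match(seq):
        return str(len(seq) // 4)
    match = _INTERRUPTED.match(seq)
    if match:
        complete_units = (len(match.group(1)) + len(match.group(2))) // 4
        return f"{complete_units}.3"
    return None  # structure not covered by this sketch


example = dxsf02_allele("ATAG" * 6 + "ATG" + "ATAG" * 5)
```

Here six complete units before the ATG and five after it give the designation 11.3.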

Figure 2
Diagram of Chromosome X with specific loci and genetic markers. Loci are labeled from p22.31 to q28 with corresponding markers like DXS10135, DXS10079, and DXS10103, aligned to hg38 locations measured in megabases, with groups LG1 to LG7.

Figure 2. The physical locations of the 7 linkage groups with 25 X-STRs on the X chromosome (the 3 previously identified LGs are shown in bold).

3.2 Dual-method linkage analysis confirms within-group consistency and between-group independence

To assess linkage relationships, MLOD values were calculated using X-STR genotypes from the two-generation family datasets (Supplementary Table S3). MLOD scores within each linkage group were consistently greater than 5, ranging from 5.0316 (DXS8378-DXSF02) to 13.8435 (DXS10146-DXS10134), whereas MLOD scores between different groups were less than 1, ranging from 0.0000 (DXSF02-DXSF07, DXSF19-DXSF28 and DXS10101-DXS10146) to 0.6474 (DXS10075-DXSF15) (Table 2), indicating strong linkage within groups and independence between groups.


Table 2. The MLOD of the adjacent X-STR pairs in 7 linkage groups.

The LD patterns of all seven X-STR linkage groups are shown in Figure 3. Among them, LG2, LG3, and LG6 exhibited strong and continuous linkage disequilibrium across the ±10 kb flanking regions, with most SNP pairs in complete LD. These patterns suggest stable historical co-inheritance and minimal recombination within these regions. In contrast, LG1, LG4, and LG7 displayed fragmented LD structures with multiple interspersed low-LD segments, while LG5 showed particularly scattered LD segments, probably because it spans the longest region.

Figure 3
Genetic linkage maps for lineages LG1 to LG7 displayed as triangular heatmaps. Each map shows colored blocks from red to yellow, representing varying degrees of linkage disequilibrium, with a color key indicating values from zero to one. Labels above each triangle provide marker identifiers and positions.

Figure 3. LD heatmaps of the seven X-STR linkage groups, including the 10 kb upstream and downstream flanking regions. A D′ value close to 1.0 indicates strong LD, while a value close to 0.0 indicates weak LD.

To investigate the LD characteristics of the whole X chromosome, we plotted the average pairwise R² values of X-SNPs across different genomic distance intervals (Supplementary Table S4). The results showed that LD decayed to R² ≈ 0.1 at an average inter-marker distance of approximately 3.7 kb, indicating that short-range SNPs on the X chromosome still exhibit a moderate level of correlation. This pattern supports the potential of using X-SNP data to estimate X-STR recombination rates with finer resolution.
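For readers unfamiliar with the statistic, the sketch below shows in Python how pairwise R² (and D′) can be computed from phased biallelic haplotypes and averaged by distance bin to obtain a decay curve. It is a simplified illustration, not a reimplementation of PopLDdecay or LDBlockShow; the function names, bin size and toy data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, List, Sequence, Tuple


def ld_stats(hap_a: Sequence[int], hap_b: Sequence[int]) -> Tuple[float, float]:
    """Return (r2, D') for two biallelic SNPs given 0/1 alleles per haplotype."""
    n = len(hap_a)
    p_a = sum(hap_a) / n
    p_b = sum(hap_b) / n
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic")
    r2 = d * d / denom
    # D' rescales |D| by its maximum possible value given the allele frequencies.
    if d > 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    return r2, d_prime


def ld_decay(positions: Sequence[int], haplotypes: Sequence[Sequence[int]],
             bin_size_bp: int = 1000, max_dist_bp: int = 10000) -> Dict[int, float]:
    """Mean r2 per distance bin, keyed by the lower edge of each bin in bp."""
    sums: Dict[int, float] = defaultdict(float)
    counts: Dict[int, int] = defaultdict(int)
    for i, j in combinations(range(len(positions)), 2):
        dist = abs(positions[j] - positions[i])
        if dist > max_dist_bp:
            continue
        r2, _ = ld_stats(haplotypes[i], haplotypes[j])
        bin_index = dist // bin_size_bp
        sums[bin_index] += r2
        counts[bin_index] += 1
    return {b * bin_size_bp: sums[b] / counts[b] for b in sorted(sums)}


# Toy data: three hypothetical SNPs typed on four haplotypes.
toy_positions = [100, 600, 5100]
toy_haps: List[List[int]] = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 0, 1]]
toy_curve = ld_decay(toy_positions, toy_haps)
```

The distance at which the binned mean first falls to about 0.1 corresponds to the 3.7 kb figure reported above.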

Additionally, by comparing LD decay patterns of the X chromosome with those of representative autosomes including chromosome 1 (longest), chromosome 10 (medium-length), and chromosome 21 (shortest), we observed that the X chromosome exhibited a slower LD decay rate than all three (Figure 4; Supplementary Table S4). This distinct LD behavior of the X chromosome may reflect its unique inheritance patterns and effective population size.

Figure 4
Line graph depicting linkage disequilibrium decay, with R-squared values on the y-axis and distance in kilobases on the x-axis. Four chromosomes are shown: chr1 (solid red), chr10 (dotted blue), chr21 (dashed orange), and chrX (dashed green). All lines show a steep decline initially, then plateau.

Figure 4. LD decay curves of chromosomes X, 1, 10 and 21.

3.3 Dual-method recombination fraction analysis confirms within-group consistency and between-group independence

Recombination fractions were first evaluated using the same family-based X-STR data. As illustrated in Figure 5 and detailed in Table 3, intra-group recombination rates ranged from 0.0000 (DXS10134-DXS10147 and DXS6803-DXSF18) to 0.0487 (DXSF29-DXSF33). In contrast, recombination fractions between groups ranged from 0.1561 (DXS10075-DXSF15) to 0.4133 (DXSF02-DXSF07). These results demonstrate a low frequency of recombination within clusters and a markedly higher frequency between clusters, supporting the classification of these seven groups as independently inherited linkage groups. For adjacent X-STR pairs shared with other studies, the recombination fractions generally showed no significant difference from previously reported values, except for HPRTB-DXS10101.

Figure 5
Bar chart showing recombination fractions of pairwise X-STRs. The x-axis lists different X-STR pairs, while the y-axis represents recombination fraction values from 0 to 0.5. Bars are red for inter-group recombination and orange for intra-group recombination. A dashed line at 0.05 indicates the intra-group threshold, and another at 0.5 marks the independence threshold.

Figure 5. The recombination rates of pairwise X-STRs from two-generation families in this study (the intergroup recombination rates were shown in red and intra-group recombination were shown in orange).


Table 3. The recombination fractions between adjacent X-STR pairs estimated from STR data, compared with previous studies.

Considering the limited resolution of recombination events in two-generation families, we analyzed recombination fractions with phased SNP data from 602 trios in the 1kGP (Byrska-Bishop et al., 2022). In the linkage analysis, we observed that the flanking SNPs of certain intra-group X-STR pairs showed incomplete linkage. Such disrupted LD patterns suggest potential historical recombination events, sparse SNP informativeness or evolutionary drift, which may influence the reliability of SNP-based recombination inference within these clusters. Given these limitations, we excluded intra-group recombination estimates derived from X-SNP trio data and instead relied on pedigree-based X-STR genotypes. SNP-based inference was restricted to inter-group regions because of this scale effect: for intra-group pairs (<1 Mb), the extremely low crossover rate makes the analysis susceptible to artifacts from non-crossover gene conversion events, which can mimic crossovers in short windows (Jeffreys and May 2004; Williams et al., 2015).

The ranges of X-SNPs searched in this study are shown in Table 4. The highest recombination fraction between X-STR linkage groups was 53.88% (LG2-LG3), while the lowest was 45.62% (LG1-LG2). Recombination between the 7 linkage groups thus occurred in approximately 50% of informative meioses (Table 5), which suggests that these 7 linkage groups are transmitted independently.
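Whether an observed fraction is compatible with free recombination (θ = 0.5) can be judged from an exact binomial confidence interval, the kind of interval the Methods refer to. The Python sketch below computes a Clopper-Pearson interval with the standard library only, solving for the bounds by bisection. The counts in the example are hypothetical placeholders, not the values behind Table 5, and the function names are illustrative.

```python
import math
from typing import Callable, Tuple


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p); intended for moderate n (hundreds)."""
    return sum(math.comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))


def _solve_increasing(func: Callable[[float], float], target: float) -> float:
    """Bisection on [0, 1] for a function that increases with p."""
    lo, hi = 0.0, 1.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if func(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def clopper_pearson(k: int, n: int, alpha: float = 0.05) -> Tuple[float, float]:
    """Exact two-sided (1 - alpha) confidence interval for a binomial proportion."""
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("require n > 0 and 0 <= k <= n")
    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else _solve_increasing(
        lambda p: 1.0 - binom_cdf(k - 1, n, p), alpha / 2.0)
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else _solve_increasing(
        lambda p: 1.0 - binom_cdf(k, n, p), 1.0 - alpha / 2.0)
    return lower, upper


# Hypothetical counts: 50 recombinants among 100 informative meioses.
ci_low, ci_high = clopper_pearson(50, 100)
covers_half = ci_low <= 0.5 <= ci_high
```

An interval that covers 0.5 is consistent with independent transmission of the two groups; an interval lying entirely below 0.5 would instead point to residual linkage.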


Table 4. The range of X-SNPs and corresponding rs ID for recombination pedigree study with 1000 Genomes Project.


Table 5. The recombination fraction between X-STR linkage groups using SNP data from the 1000 Genomes Project.

4 Discussion

In this study, we investigated the linkage and recombination patterns of 25 X-STRs within seven linkage groups. Our findings, supported by MLOD analysis, recombination rates and X-SNP analysis of 1kGP data, showed that these 7 linkage groups are transmitted independently and have considerable application prospects in forensic medicine.

Although X-STRs can serve as a complementary tool to autosomal markers, occasional recombination within clusters during allele transmission poses challenges when evaluating the weight of evidence. The physical length of each cluster in this study is less than 1 Mb (genetic distance of ≤1 cM), so alleles within a cluster should tend to be transmitted together, with a low segregation rate. Despite the limited sample size of the family data in this study, high MLOD values and low recombination rates were observed between pairwise STR loci within each cluster, and, as expected, our results for most X-STRs show no significant deviation from other family-based studies (Nothnagel et al., 2012; Bini et al., 2019). A lower frequency of intra-group recombination is beneficial for more accurate kinship inference (Kling et al., 2021). We also found discrepancies for pairs such as DXS10074-DXS10075 (Rc = 0.0225 despite ∼21 kb distance and complete LD) and HPRTB-DXS10101 (Rc = 0.0206 vs. 0.0000 previously; Yang et al., 2019). With the inclusion of 95% CIs, the lower bounds of the observed recombination rates for pairs such as DXS10074-DXS10075 approach zero. Furthermore, Fisher’s exact tests showed no statistically significant difference between the recombination rates in this study and those reported previously (p = 0.102 for DXS10074-DXS10075 and p = 0.095 for HPRTB-DXS10101; Yang et al., 2019). This suggests that the rare events observed likely reflect stochastic fluctuations or gene conversions rather than systematic divergence. Alternatively, differences in recombination rates may be attributable to population-specific recombination hotspots (Hinch et al., 2011), which are controlled by the highly polymorphic PRDM9 gene (Baudat et al., 2010), leading to varying recombination landscapes between Chinese Han and other populations.
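The comparison of recombinant counts between two studies with Fisher's exact test can be sketched in Python as below. The implementation sums the hypergeometric probabilities of all tables that are no more likely than the observed one, which is the usual two-sided definition. The function name and the example table are hypothetical, and the underlying counts for the p-values quoted above are not reproduced here.

```python
import math


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows could be the two studies; columns recombinant and non-recombinant
    meioses.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return math.comb(row1, x) * math.comb(row2, col1 - x) / math.comb(total, col1)

    p_observed = prob(a)
    x_min = max(0, col1 - row2)
    x_max = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    p_value = sum(prob(x) for x in range(x_min, x_max + 1)
                  if prob(x) <= p_observed * (1.0 + 1e-9))
    return min(1.0, p_value)


# Hypothetical counts for illustration only.
p_example = fisher_exact_two_sided(3, 1, 1, 3)
```

With counts as small as those typical of intra-group recombination, the exact test is preferable to a chi-square approximation, which is presumably why it was chosen here.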

The physical length of the X chromosome is 155 Mb (about 180 cM), and it is generally considered that there should be only three or four independent X-STR linkage groups on the X chromosome (Tillmar et al., 2017). However, the MLOD analysis and the investigation of X-SNPs using the 1kGP cohort both indicated a lack of linkage between linkage groups during meiosis. Therefore, the inheritance of these 7 linkage groups can be considered independent. In contrast, a relatively low recombination rate was observed using the two junction X-STRs of adjacent clusters. This may reflect underestimation of the recombination rate due to unphased maternal X-STR genotypes or the small sample size (Hering et al., 2010). Conversely, the SNP-based approach may overestimate the recombination fraction because of non-crossover recombination events in SNP regions (Saito and Colaiacovo, 2017). The LD decay result for the X chromosome supports this: the R² value drops to 0.1 when the distance between SNP pairs exceeds 3.7 kb, whereas the distance spanned by an X-STR is very short and almost negligible when considering linkage relationships. Non-crossover events (e.g., gene conversion), which occur over extremely short distances, may be detected by high-density SNP arrays as apparent recombination events even without a true exchange of chromosome segments. If these events are mistakenly counted as recombination, the recombination rate may be exaggerated. However, the 10 kb window used in this study is relatively large, making it less susceptible to the effects of short-range gene conversion than single-SNP analysis (Charlesworth, 2017). The observed differences in recombination rates between family studies and population SNP data are not methodological flaws but reflect differences in the biological processes measured (meiotic recombination versus historical recombination/LD). The unique inheritance pattern and effective population size of the X chromosome are key factors contributing to this paradox. It is important to note that these two strategies capture different biological timescales: pedigree analysis directly observes recent meiotic recombination events relevant to immediate kinship, whereas SNP-based LD analysis reflects historical recombination and evolutionary history accumulated over generations. Although the results of the two methods differ somewhat, both support independent inheritance of the 7 linkage groups studied; they provide complementary perspectives and jointly depict the complex picture of linkage relationships on the X chromosome.

The differences in LD heatmaps between LG2-LG3-LG6 and LG1-LG4-LG7 underscore regional variation in X chromosomal linkage architecture. The contiguous high-LD blocks observed for LG2, LG3 and LG6 reflect dense SNP spacing, low historical recombination and stable co-transmission, corroborating our pedigree-based evidence for tight linkage within these linkage groups. Conversely, the fragmented LD in LG1, LG4 and LG7 may arise from sparse informative SNPs, higher local recombination rates or allele frequency variation, all of which attenuate detectable LD, and it likely reflects local recombination hotspots or structural complexity. Unlike the ‘cold’ regions of LG2 and LG6, which preserve long haplotypes, these fragmented regions may contain active recombination initiation sites defined by PRDM9 binding, leading to rapid decay of linkage disequilibrium over short physical distances (Baudat et al., 2010). Furthermore, the X chromosome is known to harbor complex repetitive elements; such structural variation could reduce the density of stable informative SNPs at these loci, contributing to the observed discontinuity in LD blocks (Williams et al., 2015; Song et al., 2007). These findings indicate that SNP-based linkage validation is robust for most linkage groups but also reveals structural complexity that should be considered in forensic applications.

LD decay analysis further revealed that the X chromosome exhibits slower LD decay than autosomes (e.g., chromosomes 1, 10, and 21), suggesting extended haplotype blocks. The combination of a lower recombination rate and a lower mutation rate leads to faster genetic drift, which in turn makes LD and the population structure of the X chromosome more robust (Garcia et al., 2022). However, this does not imply stronger observable LD among X-STRs. In fact, the low recombination rate and the unique inheritance mechanism of the X chromosome (i.e., the lack of recombination in males) may reduce the power of LD-based analyses in population samples. Moreover, the use of unrelated individuals may underestimate true LD because of the limited ability to detect co-segregation events.
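For readers less familiar with the statistic behind LD heatmaps and LD decay curves, pairwise LD between two biallelic SNPs is commonly summarized as r². The following is a minimal Python sketch, assuming phased haplotypes coded as 0/1 at each site. The function name `ld_r2` and the toy haplotypes are illustrative assumptions and are not taken from the study's pipeline.

```python
from typing import Sequence


def ld_r2(hap_a: Sequence[int], hap_b: Sequence[int]) -> float:
    """Return r^2 between two biallelic sites from phased haplotypes.

    hap_a[i] and hap_b[i] are the alleles (0 or 1) carried at site A and
    site B by the same haplotype i. This is an illustrative helper, not
    the study's implementation.
    """
    if len(hap_a) != len(hap_b) or len(hap_a) == 0:
        raise ValueError("haplotype vectors must be non-empty and equal in length")
    n = len(hap_a)
    p_a = sum(hap_a) / n  # frequency of allele 1 at site A
    p_b = sum(hap_b) / n  # frequency of allele 1 at site B
    # frequency of haplotypes carrying allele 1 at both sites
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a site is monomorphic")
    return d * d / denom


# Toy data (hypothetical): perfectly correlated vs. uncorrelated sites
linked = ld_r2([0, 1, 0, 1], [0, 1, 0, 1])
unlinked = ld_r2([0, 0, 1, 1], [0, 1, 0, 1])
```

In an LD decay analysis, values of this kind are averaged over SNP pairs binned by physical distance, so a faster drop in mean r² with distance corresponds to the fragmented pattern described above.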

Overall, while population-based LD tests provide valuable insights into the historical genetic structure of the X chromosome, they should be viewed as complementary to, rather than a direct substitute for, pedigree analysis. Specifically, for forensic kinship applications that require precise estimates of recent transmission probabilities, relying solely on population LD metrics may be insufficient because historical LD patterns do not necessarily match current meiotic recombination rates.
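The pedigree-based quantities referred to in this discussion (recombination fraction and maximum LOD) can be illustrated with a two-point calculation. The sketch below assumes phase-known informative meioses in which recombinants can be counted directly, which is a simplification of real pedigree likelihoods. The function names `lod_score` and `max_lod` and the example counts are hypothetical, and this is not the software used in the study.

```python
import math
from typing import Tuple


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """Two-point LOD for `recombinants` out of `meioses` at fraction theta.

    LOD(theta) = log10[ theta^k * (1 - theta)^(n - k) / 0.5^n ],
    valid for phase-known meioses and 0 <= theta <= 0.5.
    """
    if meioses <= 0 or not 0 <= recombinants <= meioses:
        raise ValueError("need 0 <= recombinants <= meioses and meioses > 0")
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    k, n = recombinants, meioses
    log_lik = 0.0
    if k > 0:
        if theta == 0.0:
            return float("-inf")  # recombinants are impossible at theta = 0
        log_lik += k * math.log10(theta)
    if n - k > 0:
        log_lik += (n - k) * math.log10(1.0 - theta)
    # subtract the log-likelihood under free recombination (theta = 0.5)
    return log_lik - n * math.log10(0.5)


def max_lod(recombinants: int, meioses: int) -> Tuple[float, float]:
    """Return (theta_hat, maximum LOD), with theta_hat capped at 0.5."""
    theta_hat = min(recombinants / meioses, 0.5)
    return theta_hat, lod_score(recombinants, meioses, theta_hat)


# Hypothetical counts: 0 recombinants vs. 10 recombinants in 20 meioses
tight = max_lod(0, 20)
free = max_lod(10, 20)
```

With these toy counts, zero recombinants in 20 meioses gives a maximum LOD of about 6.02 at an estimated recombination fraction of 0, whereas 10 recombinants in 20 meioses gives a LOD of 0 at 0.5, which mirrors the contrast between strong intra-group linkage and near independence between groups.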

As public phased WGS data and X-STR databases continue to grow, future studies with deeper family structures or long-read phased sequencing may refine the linkage and recombination relationships of the X chromosome. We expect that a more thorough and accurate understanding of X chromosome linkage and recombination will be better applied in forensic science and genetics.

Data availability statement

The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: https://www.internationalgenome.org/data, 1000 Genomes Project.

Ethics statement

The studies involving humans were approved by the Ethics Committee of the School of Basic Medical Sciences, Fudan University. The studies were conducted in accordance with the local legislation and institutional requirements. The participants provided their written informed consent to participate in this study.

Author contributions

JQ: Data curation, Methodology, Software, Writing – original draft. XQ: Conceptualization, Writing – original draft. QJ: Writing – review and editing. ZL: Resources, Writing – review and editing. CS: Resources, Writing – review and editing. HX: Resources, Writing – review and editing. FY: Funding acquisition, Writing – review and editing. JX: Conceptualization, Funding acquisition, Writing – review and editing.

Funding

The author(s) declared that financial support was received for this work and/or its publication. This study was supported by Shanghai Public Security Bureau Science and Technology Project (2023005) and the Opening Project of Shanghai Key Laboratory of Crime Scene Evidence (2023XCWZK01).

Conflict of interest

The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declared that generative AI was not used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2025.1727583/full#supplementary-material

References

Aneli, S., Fariselli, P., Chierto, E., Bini, C., Robino, C., and Birolo, G. (2023). Recombulator-X: a fast and user-friendly tool for estimating X chromosome recombination rates in forensic genetics. PLoS Comput. Biol. 19, e1011474. doi:10.1371/journal.pcbi.1011474

Baudat, F., Buard, J., Grey, C., Fledel-Alon, A., Ober, C., Przeworski, M., et al. (2010). PRDM9 is a major determinant of meiotic recombination hotspots in humans and mice. Science 327, 836–840. doi:10.1126/science.1183439

Bini, C., Di Nunzio, C., Aneli, S., Sarno, S., Alù, M., Carnevali, E., et al. (2019). Analysis of recombination and mutation events for 12 X-Chr STR loci: a collaborative family study of the Italian Speaking Working Group Ge.F.I. Forensic Sci. Int. Genet. Suppl. Ser. 7, 398–400. doi:10.1016/j.fsigss.2019.10.027

Byrska-Bishop, M., Evani, U. S., Zhao, X., Basile, A. O., Abel, H. J., Regier, A. A., et al. (2022). High-coverage whole-genome sequencing of the expanded 1000 genomes project cohort including 602 trios. Cell 185, 3426–3440.e19. doi:10.1016/j.cell.2022.08.004

Charlesworth, D. (2017). Evolution of recombination rates between sex chromosomes. Philos. Trans. R. Soc. Lond B Biol. Sci. 372, 20160456. doi:10.1098/rstb.2016.0456

Danecek, P., Auton, A., Abecasis, G., Albers, C. A., Banks, E., Depristo, M. A., et al. (2011). The variant call format and VCFtools. Bioinformatics 27, 2156–2158. doi:10.1093/bioinformatics/btr330

Dong, S. S., He, W. M., Ji, J. J., Zhang, C., Guo, Y., and Yang, T. L. (2021). LDBlockShow: a fast and convenient tool for visualizing linkage disequilibrium and haplotype blocks based on variant call format files. Brief. Bioinform 22, bbaa227. doi:10.1093/bib/bbaa227

Gabriel, S. B., Schaffner, S. F., Nguyen, H., Moore, J. M., Roy, J., Blumenstiel, B., et al. (2002). The structure of haplotype blocks in the human genome. Science 296, 2225–2229. doi:10.1126/science.1069424

Garcia, F. M., Bessa, B. G. O., Dos Santos, E. V. W., Pereira, J. D. P., Alves, L. N. R., Vianna, L. A., et al. (2022). Forensic applications of markers present on the X chromosome. Genes (Basel) 13, 1597. doi:10.3390/genes13091597

Gomes, I., Pinto, N., Antao-Sousa, S., Gomes, V., Gusmao, L., and Amorim, A. (2020). Twenty years later: a comprehensive review of the X chromosome use in forensic genetics. Front. Genet. 11, 926. doi:10.3389/fgene.2020.00926

Hering, S., Edelmann, J., Augustin, C., Kuhlisch, E., and Szibor, R. (2010). X chromosomal recombination--a family study analysing 39 STR markers in German three-generation pedigrees. Int. J. Leg. Med. 124, 483–491. doi:10.1007/s00414-009-0387-y

Hering, S., Edelmann, J., Haas, S., and Grasern, N. (2015). Paternity testing of two female siblings with investigator argus X-12 kit: a case with several rare mutation and recombination events. Forensic Sci. Int. Genet. Suppl. Ser. 5, e341–e343. doi:10.1016/j.fsigss.2015.09.135

Hinch, A. G., Tandon, A., Patterson, N., Song, Y., Rohland, N., Palmer, C. D., et al. (2011). The landscape of recombination in African Americans. Nature 476, 170–175. doi:10.1038/nature10336

Inturri, S., Menegon, S., Amoroso, A., Torre, C., and Robino, C. (2011). Linkage and linkage disequilibrium analysis of X-STRs in Italian families. Forensic Sci. Int. Genet. 5, 152–154. doi:10.1016/j.fsigen.2010.10.012

Jeffreys, A. J., and May, C. A. (2004). Intense and highly localized gene conversion activity in human meiotic crossover hot spots. Nat. Genet. 36, 151–156. doi:10.1038/ng1287

Kling, D., Phillips, C., Kennett, D., and Tillmar, A. (2021). Investigative genetic genealogy: current methods, knowledge and practice. Forensic Sci. Int. Genet. 52, 102474. doi:10.1016/j.fsigen.2021.102474

Lange, K., Papp, J. C., Sinsheimer, J. S., Sripracha, R., Zhou, H., and Sobel, E. M. (2013). Mendel: the Swiss army knife of genetic analysis programs. Bioinformatics 29, 1568–1570. doi:10.1093/bioinformatics/btt187

Liu, Q. L., Li, Z. D., Li, C. T., Zhao, H., Wu, Y. D., Li, Q., et al. (2013). X chromosomal recombination--a family study analyzing 26 X-STR loci in Chinese Han three-generation pedigrees. Electrophoresis 34, 3016–3022. doi:10.1002/elps.201300204

Morimoto, C., Manabe, S., Kawaguchi, T., Kawai, C., Fujimoto, S., Hamano, Y., et al. (2016). Pairwise kinship analysis by the index of chromosome sharing using high-density single nucleotide polymorphisms. PLoS One 11, e0160287. doi:10.1371/journal.pone.0160287

Nothnagel, M., Szibor, R., Vollrath, O., Augustin, C., Edelmann, J., Geppert, M., et al. (2012). Collaborative genetic mapping of 12 forensic short tandem repeat (STR) loci on the human X chromosome. Forensic Sci. Int. Genet. 6, 778–784. doi:10.1016/j.fsigen.2012.02.015

Novroski, N. M. M., Wendt, F. R., Woerner, A. E., Bus, M. M., Coble, M., and Budowle, B. (2019). Expanding beyond the current core STR loci: an exploration of 73 STR markers with increased diversity for enhanced DNA mixture deconvolution. Forensic Sci. Int. Genet. 38, 121–129. doi:10.1016/j.fsigen.2018.10.013

O'Connor, K. L., Hill, C. R., Vallone, P. M., and Butler, J. M. (2011). Linkage disequilibrium analysis of D12S391 and vWA in U.S. population and paternity samples. Forensic Sci. Int. Genet. 5, 538–540. doi:10.1016/j.fsigen.2010.09.003

Perera, N., Wijithalal, R., Galhena, G., and Ranawaka, G. (2022). Linkage, recombination and mutation rate analyses of 16 X-chromosomal STR loci in Sri Lankan sinhalese pedigrees. Int. J. Leg. Med. 136, 415–422. doi:10.1007/s00414-021-02762-1

Phillips, C., Ballard, D., Gill, P., Court, D. S., Carracedo, A., and Lareu, M. V. (2012). The recombination landscape around forensic STRs: accurate measurement of genetic distances between syntenic STR pairs using HapMap high density SNP data. Forensic Sci. Int. Genet. 6, 354–365. doi:10.1016/j.fsigen.2011.07.012

Saito, T. T., and Colaiacovo, M. P. (2017). Regulation of crossover frequency and distribution during meiotic recombination. Cold Spring Harb. Symp. Quant. Biol. 82, 223–234. doi:10.1101/sqb.2017.82.034132

Shrivastava, P., Jain, T., and Trivedi, V. (2014). Usefulness of X STR haplotype markers in forensic DNA profiling. Helix 4, 582–589. Available online at: http://helix.dnares.in/wp-content/uploads/2018/01/6_Helix_582-589.pdf.

Song, Y. S., Ding, Z., Gusfield, D., Langley, C. H., and Wu, Y. (2007). Algorithms to distinguish the role of gene-conversion from single-crossover recombination in the derivation of SNP sequences in populations. J. Comput. Biol. 14, 1273–1286. doi:10.1089/cmb.2007.0096

Sufian, A., Hosen, M. I., Fatema, K., Hossain, T., Hasan, M. M., Mazumder, A. K., et al. (2017). Genetic diversity study on 12 X-STR loci of investigator® argus X STR kit in Bangladeshi population. Int. J. Leg. Med. 131, 963–965. doi:10.1007/s00414-016-1513-2

Tao, R., Zhang, J., Xia, R., Yang, Z., Wang, S., Zhang, X., et al. (2020). Genetic investigation and phylogenetic analysis of three Chinese ethnic groups using 16 X chromosome STR loci. Ann. Hum. Biol. 47, 59–64. doi:10.1080/03014460.2019.1704871

Tillmar, A. O. (2012). Population genetic analysis of 12 X-STRs in Swedish population. Forensic Sci. Int. Genet. 6, e80–e81. doi:10.1016/j.fsigen.2011.07.008

Tillmar, A. O., Kling, D., Butler, J. M., Parson, W., Prinz, M., Schneider, P. M., et al. (2017). DNA commission of the International society for forensic genetics (ISFG): guidelines on the use of X-STRs in kinship analysis. Forensic Sci. Int. Genet. 29, 269–275. doi:10.1016/j.fsigen.2017.05.005

Tomas, C., Pereira, V., and Morling, N. (2012). Analysis of 12 X-STRs in greenlanders, danes and somalis using Argus X-12. Int. J. Leg. Med. 126, 121–128. doi:10.1007/s00414-011-0609-y

White, J. A., Mcalpine, P. J., Antonarakis, S., Cann, H., Eppig, J. T., Frazer, K., et al. (1997). Guidelines for human gene nomenclature (1997). HUGO nomenclature committee. Genomics 45, 468–471. doi:10.1006/geno.1997.4979

Williams, A. L., Genovese, G., Dyer, T., Altemose, N., Truax, K., Jun, G., et al. (2015). Non-crossover gene conversions show strong GC bias and unexpected clustering in humans. Elife 4, e04637. doi:10.7554/eLife.04637

Xiao, C., Yang, X., Liu, H., Liu, C., Yu, Z., Chen, L., et al. (2021). Validation and forensic application of a new 19 X-STR loci multiplex system. Leg. Med. (Tokyo) 53, 101957. doi:10.1016/j.legalmed.2021.101957

Yang, X., Zhang, X., Zhu, J., Chen, L., Liu, C., Feng, X., et al. (2017). Genetic analysis of 19 X chromosome STR loci for forensic purposes in four Chinese ethnic groups. Sci. Rep. 7, 42782. doi:10.1038/srep42782

Yang, X., Chen, Y., Zeng, X., Chen, L., Liu, C., Liu, H., et al. (2019). Linkage, recombination, and mutation rate analyses of 19 X-chromosomal STR loci in Chinese southern Han pedigrees. Int. J. Leg. Med. 133, 1691–1698. doi:10.1007/s00414-019-02121-1

Yang, Q., Qian, J., Shao, C., Yao, Y., Zhou, Z., Xu, H., et al. (2021). Identification and characterization of nine novel X-Chromosomal short tandem repeats on Xp21.1, Xq21.31, and Xq23 regions. Front. Genet. 12, 784605. doi:10.3389/fgene.2021.784605

Yoo, Y. J., and Mendell, N. R. (2008). The power and robustness of maximum LOD score statistics. Ann. Hum. Genet. 72, 566–574. doi:10.1111/j.1469-1809.2008.00442.x

Zhang, S., Zhao, S., Zhu, R., and Li, C. (2012). Genetic polymorphisms of 12 X-STR for forensic purposes in shanghai Han population from China. Mol. Biol. Rep. 39, 5705–5707. doi:10.1007/s11033-011-1379-9

Zhang, Y. D., Shen, C. M., Meng, H. T., Guo, Y. X., Dong, Q., Yang, G., et al. (2016). Allele and haplotype diversity of new multiplex of 19 ChrX-STR loci in Han population from Guanzhong region (China). Electrophoresis 37, 1669–1675. doi:10.1002/elps.201500425

Zhang, C., Dong, S. S., Xu, J. Y., He, W. M., and Yang, T. L. (2019). PopLDdecay: a fast and effective tool for linkage disequilibrium decay analysis based on variant call format files. Bioinformatics 35, 1786–1788. doi:10.1093/bioinformatics/bty875

Keywords: genetic marker, kinship analysis, linkage groups, recombination, SNP, X-chromosomal STRs

Citation: Qian J, Qian X, Ji Q, Li Z, Shao C, Xu H, Yang F and Xie J (2025) Integrative linkage and recombination analysis of 25 X-STRs across 7 linkage groups using pedigree-based and SNP-based strategies. Front. Genet. 16:1727583. doi: 10.3389/fgene.2025.1727583

Received: 18 October 2025; Accepted: 01 December 2025;
Published: 18 December 2025.

Edited by:

Michael David Martin, Norwegian University of Science and Technology, Norway

Reviewed by:

Muhammad Farhan Khan, University of Health Sciences, Pakistan
Lihong Fu, Hebei Medical University, China

Copyright © 2025 Qian, Qian, Ji, Li, Shao, Xu, Yang and Xie. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Fan Yang, ZmFuX3lhbmdfMjdAMTYzLmNvbQ==; Jianhui Xie, amh4aWVAZnVkYW4uZWR1LmNu

These authors have contributed equally to this work and share first authorship
