Twenty Years Later: A Comprehensive Review of the X Chromosome Use in Forensic Genetics

The unique structure of the X chromosome shaped by evolution has led to the present gender-specific genetic differences, which are not shared by its counterpart, the Y chromosome, and neither by the autosomes. In males, recombination between the X and Y chromosomes is limited to the pseudoautosomal regions, PAR1 and PAR2; therefore, in males, the X chromosome is (almost) entirely transmitted to female offspring. On the other hand, the X chromosome is present in females with two copies that recombine along the whole chromosome during female meiosis and that is transmitted to both female and male descendants. These transmission characteristics, besides the obvious clinical impact (sex chromosome aneuploidies are extremely frequent), make the X chromosome an irreplaceable genetic tool for population genetic-based studies as well as for kinship and forensic investigations. In the early 2000s, the number of publications using X-chromosomal polymorphisms in forensic and population genetic applications increased steadily. However, nearly 20 years later, we observe a conspicuous decrease in the rate of these publications. In light of this observation, the main aim of this article is to provide a comprehensive review of the advances and applications of X-chromosomal markers in population and forensic genetics over the last two decades. The foremost relevant topics are addressed as: (i) developments concerning the number and types of markers available, with special emphasis on short tandem repeat (STR) polymorphisms (STR nomenclatures and practical concerns); (ii) overview of worldwide population (frequency) data; (iii) the use of X-chromosomal markers in (complex) kinship testing and the forensic statistical evaluation of evidence; (iv) segregation and mutation studies; and (v) current weaknesses and future prospects.


INTRODUCTION
The X chromosome has many characteristics that are not shared by its counterpart, the Y chromosome, or by any of the autosomes of the mammalian genome. Its unique structural characteristics have been shaped by evolution, leading to the present known gender-specific genetic differences (Lahn and Page, 1999;Schaffner, 2004). In males, the single copy of the X chromosome does not allow for recombination to occur (except for the pseudoautosomal regions, PARs, which maintain homology by recombining during male meiosis). The nonrecombining regions and the PAR 1 and PAR 2 regions of the X and Y chromosomes have taken different evolutionary paths becoming highly differentiated due to different functional roles, and consequently, only a few X-Y sequence similarities remain among them (Lahn and Page, 1999). Mutation events have gathered on the Y chromosome, and in addition to the lack of recombination, these events have contributed to the loss of most of the Y chromosome's sequence and genes emerging in a distinctive configuration of repeated sequences (Lahn and Page, 1999;Schaffner, 2004) becoming specialized in male sex determination. On the other hand, the X chromosome has preserved its autosomal character, becoming one of the most stable nuclear chromosomes, holding the largest and most conserved gene arrangement across eutherian ("placental") mammals (Lahn and Page, 1999;Kohn et al., 2004;Schaffner, 2004). It is the only chromosome to have one of its pair inactivated in one sex (females), and it is "corrupted" with repeat elements, making it especially tough to produce a detailed gene sequence (Gunter, 2005). In 2005, Ross et al. (2005) published the first draft that covered approximately 99.3% of the human X chromosome euchromatic sequence. The X chromosome holds a size length of approximately 155 million base pairs (Mb) (Ross et al., 2005), representing nearly 5% of the estimated human genome size (3,200 Mb) (Lander et al., 2001). 
Regarding some of the X chromosome's structural properties, it presents a low GC content (39%) when compared to 41% of the genome average (Ross et al., 2005). The low number of functional genes detected confers the chromosome one of the lowest gene densities among the chromosomes annotated to date (Ross et al., 2005). The X chromosome's sequence data revealed not only a low concentration of genes but also small gene length as only 1.7% of the chromosome sequence is represented by exons of the identified genes, responsible for transcribing 33% of the X chromosome (Ross et al., 2005). The particular genetic characteristics of the X chromosome, shaped by evolution, are responsible for the distinctive gender-specific features (Figure 1): in the male gender, the X chromosome is (almost entirely) transmitted to females as an unchanged block. While in females, the two copies recombine, like autosomes, reorganizing genetic variation in each generation, which contributes to the increase of haplotype diversity (Schaffner, 2004). The new reshuffled chromosome is then transmitted to female and male descendants (Figure 1).
These specific properties, two recombining copies in females and a single non-recombining copy in males (except for the PARs) creating haplotypes, give X chromosome markers a particular place in forensics and in population genetics, as well as in other research areas such as human evolutionary studies and medical genetics (e.g., X-linked recessive disorders such as hemophilia or Duchenne muscular dystrophy) (Szibor, 2007). Regarding forensic and population genetic applications, the X chromosome's mode of inheritance places this chromosome between the autosomes and the uniparentally inherited genomes [mitochondrial DNA (mtDNA) and Y chromosome], providing desirable and exclusive features that are not provided by any of the latter.
FIGURE 1 | X-chromosomal inheritance. Female and male descendants inherit a recombined maternal X chromosome (1) that resulted from female meiosis. Female offspring inherit one paternal unchanged X chromosome (2) due to lack of recombination [with the exception of the pseudoautosomal regions (PARs)].
In the early 2000s, the number of publications using X-chromosomal polymorphisms in these areas of research increased steadily. However, nearly 20 years later, a conspicuous decrease in the rate of these publications is observed. For example, X chromosome short tandem repeat (X-STR) forensic-based publications reached as many as 43 in a single year (2009), while in 2019 only 18 publications were found (complete results and detailed information on the criteria used for database search are presented and discussed under the section "Factors Underlying the Relative Stagnation in X Chromosome Forensic Research"). In light of these observations, the main aim of the present work is to provide an up-to-date and objective review of the advances and applications of X-chromosomal markers in population and forensic genetics over the last two decades, since the bloom observed in the early and mid-2000s.
CURRENT DEVELOPMENTS: NUMBERS AND TYPES OF X-CHROMOSOMAL MARKERS AVAILABLE (SHORT TANDEM REPEATS, SINGLE-NUCLEOTIDE POLYMORPHISMS, AND INSERTIONS/DELETIONS)
The use of X chromosome polymorphisms in human identification and in population genetics is mainly supported by the potential applications that arise from its unique properties. Alone, or complementing the information provided by the autosomes or by markers located on the Y chromosome or mtDNA, X chromosome markers may provide essential information in many different lines of research. It must be highlighted that identity testing using X-STRs in particular contexts, namely in scenarios of (complex) kinship testing, may be the only tool to unravel certain cases. Examples of complex kinship testing scenarios where the prominent role of X chromosome polymorphisms is demonstrated are given in the section "The Use of X-Chromosomal Markers in (Complex) Kinship Testing." In the present section, we outline the state of the art of the genetic markers that have been described, to date, in the X chromosome-specific region, i.e., leaving out PARs and amelogenin. Special attention is given to X-STRs because they are the preferred markers in forensic genetics, owing to their high standardization and the existence of commercial typing kits. Although some of the first publications reporting X-STRs appeared in the 1990s (Edwards et al., 1991, 1992; Hearne and Todd, 1991; Sleddens et al., 1992), the beginning of the century marked the increase of X-STR publications that focused on the development of new multiplexes, on the genetic characterization of many different population groups (databasing), and on kinship and forensic investigations.
An extensive literature review was undertaken, with special focus on forensic-population genetic publications. Results are analyzed and tabulated separately for each type of marker, including relevant references. Supplementary Table 1 lists 85 STR loci in which usage in forensic-population genetic context was reported. In agreement with the study of Szibor et al. (2005), HumARA marker was not considered for ethical reasons. Although the number of X chromosome markers has grown since the 2007 seminal review of Szibor (2007), this growth may be illusory, since many markers were used quite rarely, sometimes only once.
Although a considerable number of X-STRs are available in the literature, a better view of their real, current usage may be given by the analysis of the multiplexes, which have been described for their genotyping. Table 1 shows the most used in-house and commercially developed X-STR multiplexes in which we update the revision of Diegoli (2015) and demonstrate clearly that the effective number of STRs routinely used is modest.
In any case, due to their high degree of discrimination, the number of standardized STRs is sufficient for most routine investigations, as will be discussed below in the section "The Use of X-Chromosomal Markers in (Complex) Kinship Testing." Novel STRs of interest for forensic applications continue to be described (Nishi et al., 2020). Despite the wide set of available X-STR markers and the many population-based studies (see section "Overview of Worldwide (Published) X Chromosome Short Tandem Repeat Population Data") that have emerged over these years, no effective X-STR database harboring this type of data exists. Some of the published population datasets are available on the FamLinkX web page (see text footnote 1) in a format that can be directly uploaded for kinship calculation using the software. Efforts were made by Szibor et al. (2006) to create an X-STR database (see text footnote 2) (ChrX-Str.org 2.0, 2020) that could anchor population data (namely, haplotype frequencies), calculation of forensically relevant parameters, information on markers such as multiplex kits, etc. Nevertheless, it seems that no updates have been made to this database, specifically regarding population data submission, as only four populations are currently available (German, Ghanaian, Japanese, and Chinese Han) ("Haplotypes"; see text footnote 2). It is also noteworthy that no autosomal STR database, such as NIST STRbase (National Institute of Standards and Technology [NIST], 2020) or STRidER (2020) (see text footnote 3), contains information on X-STRs either. One approach worth considering is that databases of the autosomal type could host X-STR data subject to the same quality control (QC) submission criteria.
In fact, several forensic-focused journals, such as Forensic Science International: Genetics and the International Journal of Legal Medicine, have published minimum requirements for publication of forensic population data from different genomic markers (e.g., autosomal, Y-chromosomal, mtDNA) (Parson and Roewer, 2010; Gusmão et al., 2017). Submission of such data to these journals requires preliminary QC assessment and inclusion in public online databases. These requirements could certainly be applied to X-chromosomal markers, ensuring the same quality of the data submitted. STRs are undoubtedly the preferred markers in human identification applications. Some of the main features that make STRs desirable markers are (i) high polymorphism, i.e., high discriminating capacity between individuals; (ii) technical ease due to rapid analysis with PCR-based technology and automated fluorescent detection by capillary electrophoresis; and (iii) the ability to generate STR multiplexes with small amplicon lengths for degraded DNA. The same cannot be said about insertions/deletions (INDELs): although these share some of the features of STRs (technical ease of analysis by PCR and ability for multiplexing), standardization is much less advanced, perhaps due to the need for a much higher number of markers to reach a high degree of discrimination among individuals. Nevertheless, INDELs represent another potential tool for addressing human genetic identification issues. In Table 2, we list the X chromosome-specific INDEL genotyping systems described in the forensic literature. Unsurprisingly, not as many X chromosome INDEL marker systems have been described as compared to autosomal INDELs (e.g., Pereira et al., 2009; Freitas et al., 2010; Zaumsegel et al., 2013).
In fact, since no commercial kits are available, few systems have been subject to interlaboratory comparisons, unlike autosomal INDELs, which have undergone international collaborative exercises (Pereira et al., 2018). One possible reason for the lack of commercial kits is the limited range of applications of X chromosome polymorphisms in forensic genetics when compared to autosomal markers. An interesting alternative typing approach, albeit difficult to analyze, is the one described in the studies of Fan et al. (2015, 2016), in which amplicons comprise various INDELs, i.e., biallelic loci that are tightly linked, composing a new marker, and that are amplified by a single pair of PCR primers. With respect to X chromosome single nucleotide polymorphisms (X-SNPs), the analysis of the state of the art is even more complex due to the diversity of non-standardized genotyping systems and platforms, which have not been submitted to interlaboratory comparisons. In Table 3, a summary of the actual forensic use of X chromosome-specific SNPs is shown. The number of table entries gives a false impression of abundance of X-SNPs; in fact, besides the mentioned limitations, the overlap of SNPs among systems is very low. Although the binary nature of SNPs may favor the analysis of degraded DNA as well as automation and high-throughput genotyping (e.g., in individual identification using complex kinship analyses in highly degraded scenarios such as natural or human-made disasters), the information content is considerably lower than for STR loci, and consequently a larger number of SNPs is needed to match the discrimination power of the commonly used STRs (e.g., Chakraborty et al., 1999; Amorim and Pereira, 2005).
TABLE 3 (partial) | MALDI-TOF mass spectrometry, Petkovski et al. (2005); 10, qPCR (TaqMan probes), Zarrabeitia et al. (2007); 25, SNaPshot, Tomas et al. (2010); 16, SNaPshot, Oki et al. (2012); 14, qPCR (TaqMan probes), Li et al. (2010).
Consequently, more loci mean more amplification products, which increases difficulty in data interpretation of DNA profile mixtures. In a multiple-donor sample interpretation, identification of each contributor may be very complex with biallelic systems. The limited number of alleles for each locus (normally two alleles) becomes hard to interpret because overlap will occur and multiple donors become hard to distinguish (Butler et al., 2007;Budowle and van Daal, 2008). Adding the mentioned data interpretation complexity in mixed profiles to the limited applications of X chromosome markers can potentially justify the lack of interest in X-SNPs observed.

OVERVIEW OF WORLDWIDE (PUBLISHED) X CHROMOSOME SHORT TANDEM REPEAT POPULATION DATA
For an overview of the worldwide population allele frequency datasets of X-STRs used in forensic genetics, we consulted the articles available in the PubMed database and in the congress proceedings of the International Society for Forensic Genetics (see text footnote 4) (The International Society for Forensic Genetics [ISFG], 2020).
This search resulted in a total of 269 articles. The first genetic studies with focus on genotyping X-STRs for forensic application start emerging in the year 1999. Since then, and until 2008, a remarkable increase of population data publications was observed (Figure 2A). Nevertheless, reported information on human X-STRs in different worldwide populations has been stagnating in the last years. Information concerning the populations, number of male and female samples, and X-STRs analyzed was compiled using 236 publications out of the 269 consulted (see Supplementary  Table 2). The remaining were excluded for different reasons, which include articles that were not in English, with overlapping data (in this case, the most updated dataset was considered), and with unclear information concerning population, markers, or total samples analyzed. Therefore and although some of these studies contain relevant information on X-STR variation (e.g., the study by Edelmann et al., 2006, which has data for DXS9908 and DXS7127 markers), these were not included in Supplementary Table 2. Furthermore, the study by Phillips et al. (2018) reports data on seven X-STRs for a large sample of 944 individuals from the HGDP-CEPH human genome diversity panel. However, since this dataset comprises samples from 51 populations with relatively low sample sizes, the results were compiled for seven continentally defined population groups, namely, African (sub-Saharan), European, Middle East (including North Africans), Central-South Asian, East Asian, Oceanian, and Native American.
In Figure 2B, it is possible to observe that the number of X-STRs analyzed is highly variable among publications, with some studies genotyping a high number of X-STRs (e.g., Liu et al., 2013; Fukuta et al., 2019) and others genotyping a reduced number of loci (e.g., Watanabe et al., 2000; Koyama et al., 2002; Carvalho and Pinheiro, 2011). The number of markers included in each study varied from 1 to 27. In 47% of the cases, this number was between 10 and 12 X-STRs; in 31%, it was below 10; and 22% of the datasets included more than 12 markers (Figure 2B). The number of markers available per dataset is partly related to the use of commercial kits, which were employed in 37.4% of the population studies (Supplementary Table 2). The first commercial kit that was optimized for forensic applications was the Argus X-UL from Biotype (Dresden, Germany), containing four X-STRs (DXS8378, DXS7132, HPRTB, and DXS7423) located in distant positions along the chromosome to avoid linkage. This kit was soon expanded (Argus X-8) with four additional X-STRs (DXS10135, DXS10074, DXS10101, and DXS10134), creating four pairs of linked X-STRs. The Argus X-12 (Qiagen, Hilden, Germany) is the most recent version of the Argus kit and is the most widely used (an optimized version, the Argus X-12 QS, is now available but contains the same markers). It comprises 12 X-STRs organized in four linkage groups: LG1, DXS10148/DXS10135/DXS8378; LG2, DXS7132/DXS10079/DXS10074; LG3, DXS10103/HPRTB/DXS10101; and LG4, DXS10146/DXS10134/DXS7423. The Goldeneye DNA ID System 17X (Goldeneye Technology Co., Ltd., Beijing, China) and the AGCU X19 STR kit (Wuxi Sino-German Meilian Biotechnology Co., Jiangsu, China) were also developed for forensic applications, although available data are virtually restricted to Chinese populations. Among in-house multiplexes, the Decaplex system developed by the GHEP-ISFG (Spanish and Portuguese Speaking Working Group of the ISFG) (Gusmão et al., 2008) has been the most widely used (14.6% of the population datasets were generated using this multiplex).
Of the 84 markers that have been described as informative for forensic applications [including HumARA, which is no longer used due to ethical issues (Szibor et al., 2005), as already mentioned], less than 50% were studied in more than 10 populations, and 29 were reported in only a single population (Figure 2C). The loci with the most accumulated allele frequency data are those included in the commercial kits (namely, the Investigator Argus X-12 kit, Qiagen) or in the in-house-developed Decaplex-GHEP-ISFG (Gusmão et al., 2008).
In Supplementary Table 2, the geographical distribution of the published human population data for X-STRs since 1999 is described. Notwithstanding the exhaustive nature of this review, it is possible that some studies are missing from this table. However, we believe that most forensic population studies on X-STRs have been identified, allowing a realistic picture of the state of the art. For a broader overview of the populations sampled, we have represented the number of datasets that have been published until now by country (Figure 3). The datasets were counted considering the number of subpopulations or ethnic groups in each publication. Populations defined at continental level (namely, the HGDP-CEPH and Africa datasets) or belonging to ethnic affiliated populations from different countries (namely, the Jews) have been excluded. In Figure 3, it is possible to observe that apart from a lack of X-STR data information for many countries, there is high heterogeneity among and inside continents. Data are scarcer in some geographical areas, namely, for sub-Saharan African and American populations (except for Argentina, Brazil, and United States). On the other hand, a large quantity of X-STR data was obtained for other populations, such as the ones from China. China is by far the best represented country not only because of the higher number of publications but also due to the inclusion of various ethnic groups in a single study. Although for some countries a large number of datasets are available for the same X-STR loci, many of those studies characterize different regions or subpopulations, which is relevant to investigate population stratification inside the country, especially when a high diversity of ethnicities coexists.
Overall, the compiled information clearly shows an imbalance between the total number of publications and the asymmetric representation of the worldwide populations. In fact, for several populations from different geographic regions, data on X-STR remain largely scarce, being the available information representative of only a small fraction of the worldwide human populations. Moreover, apart from a large variation concerning the X-STRs included in each study, many only comprise a small number of loci. Due to proximity on the chromosome, it is expected that some of the studied markers will be in linkage disequilibrium (LD) in many populations. However, data on haplotype frequencies are almost restricted to recent papers and not available for most publications consulted, invalidating the use of some of the available data in forensic applications.
Therefore, further studies on haplotype frequency distributions, as well as on mutation rates and LD, are mandatory to attain the final goal of establishing highly comprehensive and representative human reference X-STR databases.
The observed increase of X-STR studies over the years justifies the need to evaluate the X-STR nomenclature being used, at least for the most common polymorphisms. Several studies have gathered considerable sequencing data for some of the commonly used X-STRs (Gomes et al., 2016, 2017; Szibor et al., 2009). In these studies, relevant findings were reported for several markers, which demonstrate that accurate allele nomenclature designation taking into consideration the ISFG recommendations (Gusmão et al., 2006) would have had a major impact on allele assignment. One of the major gaps seen in several studies is the lack of sequencing data for, at least, the three major population groups (Asian, African, and Caucasian) when new markers are proposed, as usually only one group is analyzed. Sequencing all three groups would reveal interpopulation variation early and avoid genotyping problems when different groups are typed. This was the case for the first version of the most used X-STR commercial kit, the Investigator Argus X-12 (Qiagen). The markers in this kit were characterized mostly in individuals of European ancestry, and therefore some of the genetic variation present in other population groups was missed (Tillmar et al., 2017). Once population groups of other ancestries were studied, several markers presented high frequencies of silent alleles that had previously gone undetected (e.g., Tomas et al., 2012; Gomes et al., 2016, 2017; Tillmar et al., 2017). For example, the silent alleles for some of the loci were mostly caused by a mismatch at one of the primer binding sites (Gomes et al., 2016, 2017). After several reports on this matter, a new version was developed, the Investigator Argus X-12 QS (Qiagen), containing the same markers but with new primer designs for some of the X-STRs to resolve the high frequency of allele dropouts. Another example of inaccurate nomenclature assignment was the case of HPRTB.
In the study of Pereira et al. (2007), peculiar results were found during population comparison analyses of a Northern Portuguese population sample with other European groups. These findings prompted a deeper investigation, which uncovered issues with the HPRTB nomenclature (Szibor et al., 2009). In this latter report, the authors described that two different nomenclatures were being used within the forensic genetic community, leading to a shift in allele frequencies and consequently to errors in population comparison analyses (e.g., Pereira et al., 2007). Finally, as proposed by the ISFG recommendations on the use of X-chromosome markers (Tillmar et al., 2017), the previous recommendations on allele nomenclature already recognized for autosomal and Y-chromosomal-specific markers (Bär et al., 1997; Gill et al., 2001; Gusmão et al., 2006) can also be applied to X-STRs without the need for particular changes. It seems that very few studies take these recommendations into careful consideration, and no significant advances have been made in this field. Accuracy in sequence variation, repeat structure, and nomenclature of X-STRs remains a pending issue in forensic and population genetics research that is still often neglected.

THE USE OF X-CHROMOSOMAL MARKERS IN (COMPLEX) KINSHIP TESTING
The standard procedure to quantify the genetic evidence in kinship analyses relies upon independent autosomal markers and is grounded in Bayes' theorem. Typically, equal priors are considered, and a likelihood ratio (LR) comparing the probability of the observations assuming a pair of alternative, mutually exclusive, kinship hypotheses is computed (Gjertson et al., 2007). Indeed, autosomal information is the one generally considered, despite currently available X-chromosomal markers being able to provide great statistical power in some cases (Szibor et al., 2003; Krawczak, 2007; Szibor, 2007; Pinto et al., 2011a, 2013a; Gomes et al., 2012). Obviously excluded from these cases are those in which a "father-son" link is present in both the main and the alternative hypotheses, as, for instance, in a "paternal grandfather-granddaughter" vs. "unrelated" case analyzing a pair of individuals, since the first hypothesis, when considering X-chromosomal transmission, is equivalent to the second (Pinto et al., 2011a). In any case, the preference given to autosomal markers is easily justified and understood, not only because it allows the same approach for each kinship problem, regardless of the sex of the individuals involved, but also because of the independent transmission of the markers and, at least in most populations, the absence of LD. Conversely, the analysis of X chromosome markers offers little room to consider only independently transmitted loci, and thus recombination rates and haplotype frequencies are in general required for statistical evaluation of the evidence. Non-random association of alleles of different loci at the population level, LD (also known as gametic association), can result from population events like drift, selection, non-random mating, or admixture (Hedrick, 1987; Medina-Acosta, 2011). A close physical location of the markers, as well as population stratification, will influence the reestablishment of equilibrium.
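The Bayesian step described above can be sketched as follows. This is a minimal illustration, assuming independent loci whose per-locus LRs may simply be multiplied (a condition that unlinked autosomal markers in linkage equilibrium satisfy, but linked X-STRs generally do not). The function names and the per-locus values are hypothetical and chosen only for illustration.

```python
from math import prod

def combined_lr(per_locus_lrs):
    """Multiply per-locus likelihood ratios (valid only for independent loci)."""
    return prod(per_locus_lrs)

def posterior_probability(lr, prior=0.5):
    """Posterior probability of the main hypothesis given the LR and a prior."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = lr * prior_odds
    return posterior_odds / (1.0 + posterior_odds)

# Hypothetical per-locus LRs, for illustration only
example_lrs = [2.0, 5.0, 10.0]
lr = combined_lr(example_lrs)
w = posterior_probability(lr)  # with equal priors this reduces to LR / (LR + 1)
print(lr, w)
```

With equal priors the posterior depends on the LR alone, which is why the LR is the quantity usually reported.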
Consequently, LD results neither can be extrapolated from one population to another, nor are stable, even in a closed population, as recombination progressively breaks it. Moreover, haplotype frequencies cannot be inferred from allelic ones, and direct counting needs to be carried out.
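A minimal sketch of direct haplotype counting, assuming a sample of males (who carry a single X chromosome, so the phase of two loci is observed directly). The two-locus data and the function names are hypothetical. The LD coefficient used here is the standard D = p(AB) - p(A)p(B).

```python
from collections import Counter

def haplotype_frequencies(male_haplotypes):
    """Estimate haplotype frequencies by direct counting."""
    counts = Counter(male_haplotypes)
    n = len(male_haplotypes)
    return {hap: c / n for hap, c in counts.items()}

def ld_coefficient(male_haplotypes, allele_a, allele_b):
    """D = p(AB) - p(A) * p(B) for one allele at each of two loci."""
    n = len(male_haplotypes)
    p_ab = sum(1 for a, b in male_haplotypes if a == allele_a and b == allele_b) / n
    p_a = sum(1 for a, _ in male_haplotypes if a == allele_a) / n
    p_b = sum(1 for _, b in male_haplotypes if b == allele_b) / n
    return p_ab - p_a * p_b

# Hypothetical two-locus haplotypes (allele at locus 1, allele at locus 2)
sample = [(10, 20), (10, 20), (11, 21), (11, 21)]
freqs = haplotype_frequencies(sample)
d = ld_coefficient(sample, 10, 20)
print(freqs, d)
```

In this toy sample the haplotype (10, 20) has frequency 0.5, whereas the product of its allele frequencies is 0.25, which illustrates why haplotype frequencies cannot be inferred from allelic ones when LD is present.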
Closely located markers are said to be in linkage if they are more prone to be inherited together, as a unit, than independently. Linkage between markers depends on chromosomal recombination rate (or frequency). Two markers are unlinked if recombination between them is expected to occur in each meiosis so that half of the gametic products would be recombinant and thus the recombination fraction takes the value of 0.50. Obviously, linked markers are more prone to be in LD. Segregation analyses in one- or multi-generation family studies have been performed, aiming to estimate recombination rates between X-STRs of interest through proper bioinformatic pipelines that take into account the possibility of mutation (Nothnagel et al., 2012; Diegoli et al., 2016; Bini et al., 2019), but population-based studies, such as the HapMap project (The International HapMap Consortium, 2007), can also be considered (Phillips et al., 2012). Mapping functions such as Haldane's (Haldane, 1919) or Kosambi's (Kosambi, 1944) are used to convert genetic distances between markers into recombination rates. It is, however, noteworthy that in some kinship problems, such as the one involving a pair of females and the hypotheses of maternity and unrelatedness, linkage does not need to be taken into account, as it cancels in the LR numerator and denominator (Tillmar et al., 2017). A general framework to understand in which cases linkage has to be considered is still lacking, despite it being known that disregarding it may lead to a significant over- or underquantification of the genetic evidence (Tillmar et al., 2011; Kling et al., 2015b).
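The two mapping functions mentioned above can be written in a few lines. This sketch uses their standard forms, with the genetic distance d expressed in Morgans: Haldane, r = (1 - exp(-2d)) / 2; Kosambi, r = tanh(2d) / 2. The function names and the choice of centimorgans as the input unit are assumptions made for the example.

```python
import math

def haldane(distance_cm):
    """Recombination fraction from genetic distance (cM), no interference."""
    d = distance_cm / 100.0  # centimorgans to Morgans
    return 0.5 * (1.0 - math.exp(-2.0 * d))

def kosambi(distance_cm):
    """Recombination fraction from genetic distance (cM), with interference."""
    d = distance_cm / 100.0  # centimorgans to Morgans
    return 0.5 * math.tanh(2.0 * d)

for cm in (1, 10, 50, 200):
    print(cm, round(haldane(cm), 4), round(kosambi(cm), 4))
```

Both functions give r close to d for short distances and approach 0.50, the value for unlinked markers, as the distance grows.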
Contrary to what occurs for autosomes, where a plethora of markers from 22 chromosomes can be chosen, linkage and LD are unavoidable issues in the case of X-chromosomal analysis. Due to the length of the X chromosome, it is estimated that a maximum of four unlinked X-STRs can be analyzed simultaneously. On the other hand, higher LD values are expected for X-chromosomal markers than for autosomes since recombination only occurs in female meioses, which also have lower mutation rates than male meioses (Shimmin et al., 1993; Schaffner, 2004). Finally, it should be noted that estimates of haplotype frequencies are not as accurate as the allelic ones since much larger databases are required: just considering a simple illustrative example, a set of three loci with 10 alleles each can potentially entail the estimation of 1,000 haplotype frequencies.
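The arithmetic behind this example is a simple product over loci, sketched here with a hypothetical helper function. The comparison with the number of allele frequencies is added only to show the difference in scale.

```python
from math import prod

def possible_haplotypes(alleles_per_locus):
    """Number of distinct haplotypes that a set of linked loci can form."""
    return prod(alleles_per_locus)

alleles_per_locus = [10, 10, 10]  # three loci with 10 alleles each
n_haplotype_freqs = possible_haplotypes(alleles_per_locus)
n_allele_freqs = sum(alleles_per_locus)
print(n_haplotype_freqs, n_allele_freqs)
```

For the same three loci, only 30 allele frequencies would need to be estimated if the loci could be treated independently.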
Few software packages are available for kinship evaluations considering X-chromosomal transmission, FamLinkX being the most relevant, taking into account the possibility of mutation, linkage, and LD (Tillmar et al., 2011;Kling et al., 2015a). Also, software to weigh the a priori power of a marker to exclude a claimed relationship was already developed (Egeland et al., 2014), and the ISFG recently provided general guidelines for using X-chromosomal markers in kinship testing (Tillmar et al., 2017).

Kinship Testing and the Identity-by-Descent Framework
Considering a number of generations beyond which individuals are assumed to be unrelated, kinship measurements are based on the concept of identity-by-descent. Two alleles are called identical-by-descent (IBD) if they are copies of a given ancestral allele. Barring mutation, two alleles that are identical-by-descent must therefore be identical-by-state (IBS). For autosomal transmission, nine IBD partitions can be established considering the four alleles of a pair of individuals and their relationship (Jacquard, 1974; Weir et al., 2006; Pinto et al., 2010). This number reduces to three if non-inbred individuals are considered, as likewise occurs for X-chromosomal transmission between a pair of females (Pinto et al., 2011a). Regarding X-chromosomal transmission, there are four IBD partitions involving a female-male pair (two if assuming a non-inbred female) and two for a pair of males (Pinto et al., 2011a). Independently of the mode of genetic transmission considered, the probabilities of the genotypic observations, assuming a specific hypothesis of kinship, depend on the IBD probabilities of the pedigree and on the frequency of the alleles (Weir et al., 2006; Pinto et al., 2011a). Pedigrees with the same IBD coefficients are said to belong to the same kinship class, as they are, theoretically, indistinguishable through the use of unlinked markers (Pinto et al., 2010). In Table 4, IBD probabilities are presented for a pair of non-inbred individuals considering autosomal and X-chromosomal modes of genetic transmission and a set of commonly analyzed relationships. Algebraic formulae for the probabilities of the observations, given the identity-by-descent partitions, can be found in Weir et al. (2006) and Pinto et al. (2010, 2011a, 2012), respectively, for autosomes and X-chromosomal markers.
Finally, it should be noted that, assuming an X-chromosomal mode of transmission, relationships are not symmetrical, as the probabilities of IBD sharing may differ. For example, while a paternal aunt-nephew pair does not share X-IBD alleles (being thus equated to unrelated from the X-chromosomal point of view), a paternal uncle-niece pair shares one pair of IBD alleles with a probability of 50%.
Regardless of the mode of genetic transmission considered, striking statistical results could be obtained when the sharing of IBD alleles is mandatory, unless mutation occurs, for one of the two kinship hypotheses considered. For example, in a standard paternity problem ("unrelated" as alternative hypothesis), the probability of sharing a pair of IBD autosomal alleles (and thus IBS, barring mutation) is one, under the main hypothesis, and null under the alternative. In cases with daughters, this is also true for X-chromosomal markers, providing a higher a priori paternity exclusion power than autosomal ones (Krawczak, 2007;Pinto et al., 2013a).
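As an illustration of the higher a priori exclusion power mentioned above, the following minimal sketch enumerates daughter genotypes at a single marker and computes the chance that a randomly chosen man is excluded as her father, without the mother being typed. It assumes Hardy-Weinberg proportions and no mutation; the function names and the allele frequencies are hypothetical and are not taken from any cited software.

```python
from itertools import product

def exclusion_power_x_duo(freqs):
    """Chance that a random man is excluded as father of a daughter (X marker, duo)."""
    power = 0.0
    for (i, p_i), (j, p_j) in product(enumerate(freqs), repeat=2):
        # Ordered daughter genotype (i, j) occurs with probability p_i * p_j.
        compatible = p_i if i == j else p_i + p_j  # his single X allele must be i or j
        power += p_i * p_j * (1.0 - compatible)
    return power

def exclusion_power_autosomal_duo(freqs):
    """Same quantity for an autosomal marker, where the man carries two alleles."""
    power = 0.0
    for (i, p_i), (j, p_j) in product(enumerate(freqs), repeat=2):
        carried = p_i if i == j else p_i + p_j
        # He is excluded only if neither of his two alleles is i or j.
        power += p_i * p_j * (1.0 - carried) ** 2
    return power

freqs = [0.25, 0.25, 0.25, 0.25]  # hypothetical allele frequencies
x_power = exclusion_power_x_duo(freqs)
aut_power = exclusion_power_autosomal_duo(freqs)
```

With four equally frequent alleles, the sketch gives 0.5625 for the X-chromosomal duo and about 0.328 for the autosomal duo, in line with the qualitative statement in the text.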
In some cases, as in disaster victim identification problems, specific kinship hypotheses cannot be established, and a broader measure of kinship can be used to weigh the degree of relatedness before specifying more detailed hypotheses. In these cases, the coancestry coefficient, i.e., the probability of selecting two IBD alleles when one is randomly chosen from each individual, can be computed. Here, the analysis of the X chromosome can be of major importance because, in all the cases where transmission is not interrupted by a "father-son" link, the expected IBD sharing is at least the same as for autosomes (see Table 5), since no randomness is possible in the X-allele of a male. Coancestry coefficients can be estimated through the genotypes of the individuals (Pinto et al., 2011b, 2013b), and the combination of both types of genetic information can provide valuable insights on the genetic kinship linking the individuals.
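The definition of the coancestry coefficient can be made concrete with a short sketch that derives it from the probabilities of sharing one or two pairs of IBD alleles. It assumes non-inbred individuals; the function name and its parameters are illustrative only, with the number of alleles set to 2 for autosomes or for the X chromosome in a female, and to 1 for the X chromosome in a male.

```python
def coancestry(k1, k2=0.0, alleles_a=2, alleles_b=2):
    """Probability that one allele drawn at random from each individual is IBD.

    k1, k2: probabilities of sharing one / two pairs of IBD alleles.
    alleles_a, alleles_b: alleles carried by each individual at the locus
    (2 for autosomes or a female X, 1 for a male X).
    """
    # There are alleles_a * alleles_b equally likely draws; one shared pair
    # gives one favourable draw, two shared pairs give two.
    return (k1 + 2.0 * k2) / (alleles_a * alleles_b)

# Parent-child with autosomes versus father-daughter with the X chromosome
aut_parent_child = coancestry(k1=1.0)
x_father_daughter = coancestry(k1=1.0, alleles_a=1, alleles_b=2)

# Full sisters: autosomes (k1 = 0.5, k2 = 0.25) versus X chromosome (k1 = k2 = 0.5)
aut_full_sisters = coancestry(k1=0.5, k2=0.25)
x_full_sisters = coancestry(k1=0.5, k2=0.5)
```

Under these assumptions, the father-daughter coefficient doubles from 0.25 (autosomes) to 0.5 (X chromosome), and the full-sisters coefficient rises from 0.25 to 0.375.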

Parenthood Testing
The X-chromosomal markers can be used to complement autosomal information when inconclusive or weak statistical results are achieved in standard parenthood testing where the alternative hypothesis is the individuals being unrelated. This can be due to the poor quality or low quantity of DNA in degraded samples, resulting in few analyzed markers or to other, more complex, situations where few Mendelian incompatibilities are found.
Compared with autosomes, X-chromosomal markers provide greater statistical power in trios, in paternity duos with daughters, and in maternity duos with sons. The X-chromosomal markers are not informative in paternity cases with sons, and for mother/daughter duos, the same statistical power is obtained for autosomal and X-chromosomal transmission.
When few Mendelian incompatibilities are found, this can be due to the alleged parent of the child being related to the true parent. A relatively common situation is the alleged father being either a full brother or the father of the true father of the child, in which case the probability of the alleged father and child sharing a pair of IBD alleles is 50%. In paternity testing with a daughter, if the alleged father is a brother of the real one, the probability of uncle-niece sharing a pair of IBD X-alleles is also 50%. In all the other cases, this probability is null. Indeed, the analysis of X-chromosomal markers can be an efficient approach for excluding close relatives of the real father, unknowingly presented in a standard paternity case.

Beyond Parenthood
In some cases, the alleged parent is not available for analysis, and sibship or grandparenthood problems may emerge. In some of these cases, X-chromosomal markers can provide invaluable information, stronger than that provided by autosomes. The most striking examples are those where the sharing of a pair of IBD X-alleles is mandatory. This occurs when the paternity of a daughter is questioned and the alleged father is unavailable for analysis, but his (unquestioned) mother or daughter is available. In both cases, the sharing of IBS alleles between the analyzed females is mandatory for all the markers, unless mutation occurs. In these cases, the statistical power reached is the same as for paternity testing with autosomes when the alleged father is directly analyzed, whether or not the mother of the child is available for analysis.
TABLE 4 | Probability of two individuals sharing two, one, or no pairs of identical-by-descent (IBD) alleles, assuming a specific kinship for both autosomal (Aut) and X-chromosomal (X chr) modes of genetic transmission.

Another illustrative example is the kinship problem where the hypotheses are "full sisters" versus "unrelated." Considering X-chromosomal transmission and the main hypothesis, females share either two or one pair of IBD X-alleles with the same probability: 50%. Assuming autosomal transmission, full sisters may share no IBD alleles (with a probability of 25%), as always occurs when they are unrelated (probability of 100%). It is then expected that X-chromosomal markers provide stronger results than autosomes. This occurs in all the kinships where the transmission of the X chromosome is not interrupted, due to its obligatory transmission between father and daughter, which allows the skipping of one meiosis.
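The "full sisters" versus "unrelated" comparison can be sketched numerically for a single marker. The code below weighs the genotypes of two females by the probabilities of sharing zero, one, or two pairs of IBD alleles under each hypothesis. It is a simplified illustration that assumes Hardy-Weinberg proportions, no mutation, and no linkage or LD with other markers; the allele labels, frequencies, and function names are hypothetical.

```python
def genotype_prob(g, p):
    """Hardy-Weinberg probability of an unordered genotype g = (a, b)."""
    a, b = g
    return p[a] ** 2 if a == b else 2 * p[a] * p[b]

def prob_given_one_ibd(g1, g2, p):
    """P(g2 | g1, exactly one pair of IBD alleles), without mutation."""
    c, d = g2
    total = 0.0
    for shared in g1:  # each allele of g1 is the shared one with probability 1/2
        if c == d:
            total += 0.5 * (p[c] if shared == c else 0.0)
        elif shared == c:
            total += 0.5 * p[d]
        elif shared == d:
            total += 0.5 * p[c]
    return total

def pair_likelihood(g1, g2, p, k):
    """Joint genotype probability given k = (k0, k1, k2) IBD probabilities."""
    k0, k1, k2 = k
    identical = 1.0 if sorted(g1) == sorted(g2) else 0.0
    return genotype_prob(g1, p) * (
        k0 * genotype_prob(g2, p) + k1 * prob_given_one_ibd(g1, g2, p) + k2 * identical
    )

def likelihood_ratio(g1, g2, p, k_main, k_alt=(1.0, 0.0, 0.0)):
    return pair_likelihood(g1, g2, p, k_main) / pair_likelihood(g1, g2, p, k_alt)

p = {"10": 0.1, "11": 0.9}            # hypothetical allele frequencies
X_FULL_SISTERS = (0.0, 0.5, 0.5)      # k0, k1, k2 for the X chromosome
AUT_FULL_SISTERS = (0.25, 0.5, 0.25)  # k0, k1, k2 for autosomes

lr_x = likelihood_ratio(("10", "10"), ("10", "10"), p, X_FULL_SISTERS)
lr_aut = likelihood_ratio(("10", "10"), ("10", "10"), p, AUT_FULL_SISTERS)
```

For two females both homozygous for an allele with frequency 0.1, the sketch gives a likelihood ratio of 55 with the X-chromosomal probabilities and 30.25 with the autosomal ones. For two females with no allele in common, the X-chromosomal ratio is zero (an exclusion, barring mutation), whereas the autosomal ratio is 0.25.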

Incest Cases
In some cases, the high number of homozygosities shown by a child (e.g., in paternity testing with the alleged father excluded) may raise the suspicion of an incestuous situation. This may, under some circumstances, constitute a crime (mother under age or with intellectual disability, for example). In the case of a daughter, X-chromosomal analyses may provide important insights even without analyzing the alleged father. If the father of the daughter is also the father of the mother then, in the absence of mutation, the child is either homozygous (for one allele present in the mother) or heterozygous for the same alleles as the mother. In the case of autosomal transmission, three alleles can be seen in a mother/daughter pair, as in the case of the parents being unrelated. The hypotheses of the father of the child being either the father or the full brother of the mother are theoretically indistinguishable when considering unlinked autosomal markers. In contrast, in the case of daughters, X-chromosomal markers can provide insights allowing the different weighing of the two hypotheses (Pinto et al., 2011a).

TABLE 5 | Probability of choosing a pair of identical-by-descent (IBD) alleles when one allele is randomly chosen from each individual. Numbers in superscript in header refer to the sex of the individuals represented in genealogies.

Distinguishing Pedigrees Belonging to the Same Autosomal Kinship Class
Pedigrees are theoretically indistinguishable, considering unlinked markers, whenever they have the same IBD partitions (Pinto et al., 2010). This is the case of the second-degree relatives (avuncular, half-siblings, and grandparent-grandchild), as the probability of individuals sharing two pairs of IBD alleles is null, while the probability of sharing one pair of IBD autosomal alleles is equal to the probability of sharing none (50%); see Table 4. Nevertheless, the analysis of X-chromosomal markers can provide differential weighing favoring one of the alternative hypotheses (Pinto et al., 2011a). For example, when a pair of females is analyzed, maternal and paternal aunt/niece can be distinguished from, respectively, maternal and paternal half-sisters and grandmother-granddaughter, which are not distinguishable from each other even when considering X-chromosomal markers. In all the cases, females cannot share two pairs of IBD alleles, but a pair of maternal aunt/niece shares one pair of IBD alleles with a probability equal to 75%, while for both maternal half-sisters and maternal grandmother-granddaughter pairs, this probability reduces to 50%. On the other hand, while both paternal half-sisters and paternal grandmother-granddaughter pairs must share one pair of IBD alleles, this probability drops from 100 to 50% in the case of paternal aunt/niece. Different IBD probabilities will result in different weighing of the evidence, depending on the genotypic observations.
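The X-chromosomal IBD probabilities quoted above can be checked by exhaustively enumerating the possible transmissions at a single X-specific locus. The following sketch is one way of doing so, assuming non-inbred individuals, no mutation, and a pedigree supplied as a hypothetical dictionary in which parents are listed before their children; it is not the method implemented in any of the cited software.

```python
from fractions import Fraction
from itertools import product

def x_ibd_distribution(pedigree, a, b):
    """Distribution of the number of X-IBD allele pairs shared by a and b.

    pedigree: {name: (sex, father, mother)}; founders have father = mother = None.
    """
    children = [n for n, (_, _, mother) in pedigree.items() if mother is not None]
    weight = Fraction(1, 2 ** len(children))
    dist = {}
    for choices in product((0, 1), repeat=len(children)):
        pick = dict(zip(children, choices))
        alleles = {}
        for name, (sex, father, mother) in pedigree.items():
            if mother is None:  # founder: unique allele labels
                alleles[name] = [name + ".1", name + ".2"] if sex == "F" else [name + ".1"]
                continue
            maternal = alleles[mother][pick[name]]  # mother passes either allele
            if sex == "F":
                alleles[name] = [alleles[father][0], maternal]  # father passes his only X
            else:
                alleles[name] = [maternal]  # sons receive no X from the father
        shared = len(set(alleles[a]) & set(alleles[b]))
        dist[shared] = dist.get(shared, 0) + weight
    return dist

GRANDPARENTS = {"GF": ("M", None, None), "GM": ("F", None, None)}

maternal_aunt_niece = {
    **GRANDPARENTS,
    "Aunt": ("F", "GF", "GM"), "Mother": ("F", "GF", "GM"),
    "Dad": ("M", None, None), "Niece": ("F", "Dad", "Mother"),
}
paternal_aunt_niece = {
    **GRANDPARENTS,
    "Aunt": ("F", "GF", "GM"), "Father": ("M", "GF", "GM"),
    "Mom": ("F", None, None), "Niece": ("F", "Father", "Mom"),
}
paternal_half_sisters = {
    "Dad": ("M", None, None), "M1": ("F", None, None), "M2": ("F", None, None),
    "S1": ("F", "Dad", "M1"), "S2": ("F", "Dad", "M2"),
}

maternal_dist = x_ibd_distribution(maternal_aunt_niece, "Aunt", "Niece")
paternal_dist = x_ibd_distribution(paternal_aunt_niece, "Aunt", "Niece")
half_sisters_dist = x_ibd_distribution(paternal_half_sisters, "S1", "S2")
```

Under these assumptions, the enumeration reproduces the values in the text: a probability of 3/4 of sharing one pair of IBD alleles for maternal aunt/niece, 1/2 for paternal aunt/niece, and 1 for paternal half-sisters.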

SEGREGATION STUDIES: CURRENT DATA AND MISSING DATA
The high power of discrimination that characterizes STRs and makes them desirable genetic markers compared to SNPs or INDELs, particularly in human identification analysis (such as kinship testing), is due to their higher mutation rate. An STR is, by definition, a tandemly arrayed repetition of a DNA fragment of one to six base pairs. There is general consensus that these are created by random mutations (Levinson and Gutman, 1987; Schlötterer, 2000). Generally, STRs with four-base-pair motifs are plentiful and more stable than two- or three-nucleotide repeats; hence, they have been favored when designing the commercially available forensic kits (Pereira and Gusmão, 2016). Motifs with two or three base pairs are less stable and have a higher propensity for stutter during PCR, and STRs with longer motifs are less frequent. When a somatic mutation occurs, it affects only cell lines of the individual where it occurred. However, when a mutation occurs in the germ line, it has the potential of being passed on to the offspring, resulting in different parental and filial alleles. Mutation rates vary between types of polymorphisms and also depend on inherent individual characteristics such as sex and age (Nachman and Crowell, 2000). Polymerase template slippage is thought to be the primary mutational mechanism leading to changes in STR length (Schlötterer and Tautz, 1992; Strand et al., 1993), and mutations involving the loss or gain of one repeat are assumed to be preponderant over mutations involving the loss or gain of multiple repeats. Slippage occurs during DNA replication when the two DNA strands come apart. When the strands realign out of register, the repeat number of the STR product will be different.
The currently accepted mutational model, known as the stepwise mutation model (SMM) (Ohta and Kimura, 1973), attributes STR mutation to DNA replication slippage and includes mutational forces working in opposite directions: polymerase template slippage and point mutations. The latter reduce the length of STRs by breaking the original segment into two new shorter segments. Studies have shown that the longer the allele, the higher the frequency of these events. It has also been reported that longer alleles tend to mutate to shorter alleles and vice versa, while intermediate-sized alleles have approximately the same tendency to shorten or lengthen (Primmer et al., 1996; Brinkmann et al., 1998; Xu et al., 2000; Antão-Sousa et al., 2019).
In the forensic casework context, the estimation of mutation rates is crucial for the analysis and interpretation of experimental data and for the proper quantification of LRs. In such scenarios, the detection of mutation(s) has practical consequences for the interpretation of the genetic profiles. Some studies have addressed this by analyzing different familial configurations: familial duos (mother-son, mother-daughter, and father-daughter) and familial trios (father-mother-daughter) (e.g., Jin et al., 2016; Burgos et al., 2019; García et al., 2019). Supplementary Table 3 presents the most updated information on mutation rates per marker and per familial configuration for the most commonly used X-STRs. To date, little research has been conducted on the mutation rates of the most commonly used X-STRs, and therefore, data collection and analyses are still lacking. Perhaps one of the limitations in the estimation of mutation rates of STRs, in general, is that the most frequently used method is based on direct pedigree analysis. This means that mutated alleles are identified directly by observing allele transmissions from parent to child, which requires a large amount of data to reliably estimate allele mutation rates. The need for access to a high number of families with specific constellations may be a drawback to the (accurate) estimation of mutation rates of X-STRs.
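The dependence of direct pedigree estimates on the amount of data can be illustrated with a short sketch that converts a count of observed mutations and informative meioses into a rate with a Wilson score interval. The interval method is a choice made here for illustration, not one prescribed by the studies cited above, and the counts are hypothetical.

```python
import math

def mutation_rate_with_interval(mutations, meioses, z=1.96):
    """Point estimate and Wilson score interval for a per-meiosis mutation rate."""
    rate = mutations / meioses
    denom = 1 + z ** 2 / meioses
    centre = (rate + z ** 2 / (2 * meioses)) / denom
    half = z * math.sqrt(rate * (1 - rate) / meioses + z ** 2 / (4 * meioses ** 2)) / denom
    return rate, max(0.0, centre - half), min(1.0, centre + half)

# Hypothetical counts: same observed rate, tenfold difference in sample size
small = mutation_rate_with_interval(mutations=2, meioses=1000)
large = mutation_rate_with_interval(mutations=20, meioses=10000)
```

Both hypothetical data sets give the same point estimate (0.002 per meiosis), but the interval from 1,000 meioses is several times wider than the one from 10,000, which is why a single study with a few hundred families per marker rarely yields a precise rate.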

Factors Underlying the Relative Stagnation in X Chromosome Forensic Research
After an initial boom, forensic research interest in X chromosome markers has declined, as judged by the number of relevant publications: 2000 (6), 2001 (7), 2002 (11), 2003 (18), 2004 (27), 2005 (25), 2006 (40), 2007 (35), 2008 (42), 2009 (43), 2010 (19), 2011 (41). This implies that the practical forensic use of the X chromosome is well below its potential and, most concerning, that its use may be unsupported by research data and based on inadequately validated technical means and on theoretically limited or even incorrect analytical approaches. Enabling corrective actions therefore demands identifying the causes of this slowdown in forensically inclined research on X genetic markers. This situation has no parallel in the other sex chromosome, the Y, to which a lot of attention is devoted, for example, by STRBase (National Institute of Standards and Technology [NIST], 2020), and which also has a very active dedicated site, YHRD (2020), in contrast to ChrX-STR.org, as mentioned previously (see text footnote 2).
In this section, we analyze the presumed reasons or factors behind this loss of interest, together with the changes that might counteract it. From our point of view, these factors can be classified into four broad categories: (a) theoretical and/or analytical, (b) technical, (c) statistical, and (d) medical/ethical, which are detailed below.

Theoretical and Analytical Difficulties
The main obstacle to the correct use of the X chromosome in forensics lies in the hybrid nature of its formal genetic model of inheritance, common to most mammals, with very few exceptions (Cortez et al., 2014; Matveevsky et al., 2017). Indeed, as presented in the section "Introduction," this chromosome harbors two distinct modes of transmission: the diploid, autosomal style (corresponding to the so-called pseudoautosomal regions, of which there are two in humans, PAR 1 and PAR 2) (Flaquer et al., 2009) and the sex-linked haplodiploid style (for the rest of the chromosome, known as X-specific), which, due to the single copy in males, does not recombine in males.
When addressing X-chromosome markers, we are referring to those located in the X-specific region. Therefore, only these will be analyzed (although some confusion does sometimes arise, and quite often the X-specificity status of a marker may be doubtful; see the technical section below).
Even so, the formal genetic model of transmission and its consequences at the level of population genetics seem to be poorly understood by the forensic community, as judged by a recent analysis of the literature (Ferragut et al., 2019). It was shown that in 60% of 52 analyzed publications, forensic parameters were computed as for autosomal markers, and the analysis of associations between alleles from distinct loci (LD) was generally deficient or erroneous. In fact, the concepts of linkage and LD, which are particularly important for the X chromosome since all markers are located on the same chromosome, are often a source of confusion and generally lead to misinterpretation or even non-consideration of LD results in many genetic studies. Most studies using X-STRs correctly test for the presence of significant association among pairs of loci (LD) but, when significant association is found, fail to estimate haplotype frequencies and to perform probability calculations accordingly; in such cases, loci must be analyzed together and not as individual markers. In 2017, recommendations were provided by the DNA commission of the ISFG addressing exactly the issues behind the concepts of linkage and LD in cases of kinship testing using X-STRs and emphasizing that "Haplotype frequencies should be used for likelihood calculations when LD exists" (Tillmar et al., 2017).
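The practical meaning of the ISFG recommendation can be illustrated by comparing an observed two-locus haplotype frequency with the product of the single-locus allele frequencies. The sketch below computes the classical LD coefficient D from directly counted male haplotypes; the sample, the allele labels, and the function name are hypothetical, and no significance test is included.

```python
from collections import Counter

def ld_coefficient(haplotypes, allele_a, allele_b):
    """Return f(AB), f(A) * f(B), and D = f(AB) - f(A) * f(B).

    haplotypes: list of (allele at locus 1, allele at locus 2) tuples,
    e.g. read directly from male samples.
    """
    n = len(haplotypes)
    counts = Counter(haplotypes)
    f_ab = counts[(allele_a, allele_b)] / n
    f_a = sum(c for (a, _), c in counts.items() if a == allele_a) / n
    f_b = sum(c for (_, b), c in counts.items() if b == allele_b) / n
    return f_ab, f_a * f_b, f_ab - f_a * f_b

# Hypothetical sample of ten male haplotypes at two linked loci
sample = [("12", "20")] * 4 + [("13", "21")] * 4 + [("12", "21")] + [("13", "20")]
observed, expected_if_independent, d = ld_coefficient(sample, "12", "20")
```

In this hypothetical sample the haplotype 12-20 has an observed frequency of 0.4, whereas multiplying the allele frequencies would give 0.25. Using the product in a likelihood calculation would therefore misstate the weight of the evidence for this haplotype.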
Similar issues have also arisen with the assessment of conformity with Hardy-Weinberg equilibrium expectations. Quite symptomatically, the ChrX-STR.org website (see text footnote 2, accessed on 02/05/2020) has posted: "Based on the review of December 2018, it has been decided in cooperation with the X working group to remove the PI calculation from this website." From a practical point of view, one can add that a further reason for the decrease of interest in X chromosome markers could be the low number of identification cases that request X-STR markers. The difficulties of implementing a new system (financial cost and human resource training) that requires a much more complex type of analysis than the Y chromosome, for example, may not be seen as justified by the need for its use.

Technical Problems
Besides the genotyping problems, which may cut across all markers irrespective of the mode of transmission, sex chromosomes pose special difficulties due to their complex evolutionary history. In fact, apart from the PAR regions, the X and Y chromosomes still keep substantial stretches of homologous sequence, which hinder the safe establishment of specificity for a marker, as well as for its primers in the case of PCR-based techniques. Recently X/Y transposed regions in particular may constitute a (nearly) insurmountable obstacle (Lopes et al., 2004), as may the dynamic state of the moving pseudoautosomal boundaries (Otto et al., 2011).

Statistical Issues
Most of the statistical problems (at the descriptive or parameter estimation level, in hypothesis testing design, or in the quantitative evaluation of evidence) stem from the theoretical flaws discussed above. Nonetheless, some are specifically empirical and are related to the haplodiploid specificity of the X chromosome: different sampling and estimation methods are required for each sex. Indeed, while haplotype frequencies can be estimated by simple counting in males, in females, they have to be inferred. Needless to say, simple haplotype frequency estimation requires prohibitively large sample sizes, growing exponentially with the number of loci involved (Amorim and Pinto, 2018).
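Two of the points above lend themselves to a back-of-the-envelope sketch: the number of possible haplotypes grows multiplicatively with each locus added, and an unphased female genotype is compatible with a growing number of haplotype pairs. The function names and the allele counts per locus below are hypothetical and chosen only for illustration.

```python
from math import prod

def haplotype_space(alleles_per_locus):
    """Number of distinct haplotypes possible for a set of linked loci."""
    return prod(alleles_per_locus)

def phase_resolutions(heterozygous_loci):
    """Number of haplotype pairs compatible with an unphased female genotype."""
    return 1 if heterozygous_loci == 0 else 2 ** (heterozygous_loci - 1)

# Hypothetical cluster of linked X-STRs with ten alleles each
sizes = [haplotype_space([10] * n_loci) for n_loci in (1, 2, 3, 4)]

# A female heterozygous at all three loci of a cluster
ambiguous_pairs = phase_resolutions(3)
```

With ten alleles per locus, the haplotype space goes from 10 to 10,000 when moving from one to four loci, so a sample of a few hundred males can observe only a small fraction of the possible haplotypes. A female who is heterozygous at three linked loci is compatible with four different haplotype pairs, which is why her haplotypes must be inferred rather than counted.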

Medical/Ethical Questions
To begin with, it must be highlighted that the very genotyping of sex chromosome markers for forensic purposes may represent a violation of some of the established recommendations and rules on the exclusion of any markers that can reveal physical traits [e.g., European Council Resolution of 25 June 2001 on the exchange of DNA analysis results (2001/C 187/01)]. Furthermore, gender and sex are always sensitive, and sometimes conflicting, categories in or for some individuals.
The evolutionary dynamics of sex chromosomes also introduce undesirable clinical and ethical problems. In fact, sex chromosomes are the Achilles' heel of male meiosis (Kauppi et al., 2012). A non-negligible proportion (1/448 live births) of the human population carries some sort of chromosomal aberration and, for example, one of the aneuploidies, Klinefelter syndrome, has an incidence of ∼1/500 male live births (Nielsen and Wohlert, 1990). The consequences for forensic practice are ethically troublesome: discordance between external sex and X chromosome typing, and involuntary disclosure of a clinical condition. In addition to X-chromosomal changes, several X-STR markers that were or are still in use have been linked to medical conditions. The HumARA marker is linked to spinal and bulbar muscular atrophy (SBMA) as well as to other health risks (Szibor et al., 2005). Another example is the possible LD between the STR alleles at the HPRTB locus and the X-linked recessive disorder Lesch-Nyhan syndrome (caused by molecular defects within the HPRT gene) (Mansfield et al., 1993). Some data have shown that inheritance of two polymorphic tandem repeats, one being the HPRTB locus (mapped within intron 3 of the HPRT gene), could be used to establish linkage to the disease (Mansfield et al., 1993).
The X chromosome has had an interesting journey in the last two decades in the research fields of forensic and population genetics by providing new (population) data and aiding in the clarification of several issues, namely, in kinship testing. Its particular properties of inheritance (recombination on the female side and haploid state on the male side) have given this chromosome a role that can be fulfilled neither by the autosomes nor by its counterpart, the Y chromosome. After an initial bloom of publications, several multiplex developments, workshops at international meetings, and the creation of an X-STR database, the interest in X-chromosomal markers is gradually fading. Analytical and statistical issues may be the major underlying reasons for the lack of interest, in addition to a lower demand for X-STR-based identification cases.
Considerable effort has already been put into X-STRs, namely, (i) the generation of allelic and haplotypic frequency databases that include a fair number of geographically distinct populations; (ii) several in-house multiplexes containing a large number of highly polymorphic markers, as well as a soundly established commercial kit; and (iii) a relevant number of studies addressing and recommending solutions for the main issues surrounding X-STR-based kinship testing. Therefore, this effort should not be lost; rather, it should be directed toward reviving the standing of X chromosome markers in forensic genetics.

AUTHOR CONTRIBUTIONS
IG drafted the manuscript scheme and, in addition, all authors have made a substantial, direct, and intellectual contribution to the work, and approved the final version for publication.

FUNDING
This work was partially financed by FEDER-Fundo Europeu de Desenvolvimento Regional funds through the