A Novel SNP-STR System Based on a Capillary Electrophoresis Platform

Various compound markers encompassing two or more variants within a small region can be regarded as generalized microhaplotypes. Many of these markers have been investigated for various forensic purposes, such as individual identification, deconvolution of DNA mixtures, or forensic ancestry inference. SNP-STR is a compound biomarker composed of a single nucleotide polymorphism (SNP) and a closely linked short tandem repeat polymorphism (STR), and possess the advantages of both SNPs and STRs. In addition, in conjunction with a polymerase chain reaction (PCR) technique based on the amplification refractory mutation system (ARMS), SNP-STRs can be used for forensic unbalanced DNA mixture analysis based on capillary electrophoresis (CE), which is the most commonly used platform in worldwide forensic laboratories. Our previous research reported 11 SNP-STRs, but few of them are derived from the commonly used STR loci, for which existing STR databases can be used as a reference. For maximum compatibility with existing DNA databases, in this study, we screened 18 SNP-STR loci, of which 14 were derived from the expanded CODIS core loci set. Stable and sensitive SNP-STR multiplex PCR panels based on the CE platform were established. Assays on simulated two-person DNA mixtures showed that all allele-specific primers could detect minor DNA components in 1:500 mixtures. Population data based on 113 unrelated Chengdu Han individuals were investigated. A Bayesian framework was developed for the likelihood ratio (LR) evaluation of SNP-STR profiling results obtained from two-person mixtures. Furthermore, we report on the first use of SNP-STRs in casework to show the advantages and limitations for use in practice. Compared to 2.86 × 103 for autosomal STR kits, the combined LR reached 7.14 × 107 using the SNP-STR method in this casework example.

Various compound markers encompassing two or more variants within a small region can be regarded as generalized microhaplotypes. Many of these markers have been investigated for various forensic purposes, such as individual identification, deconvolution of DNA mixtures, or forensic ancestry inference. SNP-STR is a compound biomarker composed of a single nucleotide polymorphism (SNP) and a closely linked short tandem repeat polymorphism (STR), and possess the advantages of both SNPs and STRs. In addition, in conjunction with a polymerase chain reaction (PCR) technique based on the amplification refractory mutation system (ARMS), SNP-STRs can be used for forensic unbalanced DNA mixture analysis based on capillary electrophoresis (CE), which is the most commonly used platform in worldwide forensic laboratories. Our previous research reported 11 SNP-STRs, but few of them are derived from the commonly used STR loci, for which existing STR databases can be used as a reference. For maximum compatibility with existing DNA databases, in this study, we screened 18 SNP-STR loci, of which 14 were derived from the expanded CODIS core loci set. Stable and sensitive SNP-STR multiplex PCR panels based on the CE platform were established. Assays on simulated two-person DNA mixtures showed that all allelespecific primers could detect minor DNA components in 1:500 mixtures. Population data based on 113 unrelated Chengdu Han individuals were investigated. A Bayesian framework was developed for the likelihood ratio (LR) evaluation of SNP-STR profiling results obtained from two-person mixtures. Furthermore, we report on the first use of SNP-STRs in casework to show the advantages and limitations for use in practice. Compared to 2.86 × 10 3 for autosomal STR kits, the combined LR reached 7.14 × 10 7 using the SNP-STR method in this casework example.

INTRODUCTION
Microhaplotypes has been revealed the abilities in different forensic application purposes, including individual identification, mixture recognition (Chen et al., 2018;Oldoni and Podini, 2019;Oldoni et al., 2020), and ancestry inference (Bulbul et al., 2018;Chen et al., 2019). In our opinion, compound biomarkers consisting of two or more variants that occur within a small region, can be regarded as generalized microhaplotypes, and include single nucleotide polymorphisms (SNPs) closely linked to short tandem repeats (STRs) (SNP-STR), insertion and deletion polymorphisms (indels) closely linked to STRs (DIP-STR), indel polymorphisms closely linked to SNPs (DIP-SNPs), and several indel polymorphisms physically linked very tightly (multi-indels) (Castella et al., 2013;Wang et al., 2015;Wendt et al., 2016;Tan et al., 2017Tan et al., , 2018Qu et al., 2020). Among various markers, STRs are the most popular multiallelic markers used in forensics worldwide. Since they are highly polymorphic and discriminative among individuals, they were adopted as reference loci for the Combined DNA Index System (CODIS) and also facilitated the worldwide implementation of the crime National DNA Databases (NDNADs) (Hares, 2012(Hares, , 2015Karantzali et al., 2019). A variety of commercial capillary electrophoresis (CE)based autosomal STR amplification kits have been developed and used for human identification, kinship relationship, and mixture deconvolution Gymrek, 2017;Al-Eitan et al., 2019). SNPs are the most common type of genetic variation among the human genome and are also the most useful for studying human evolutionary history over long time scales (Agrafioti and Stumpf, 2007). Indels occur less frequently in the genome than SNPs. Furthermore, indel markers are almost unavailable around the forensic commonly used STRs, such as the expanded CODIS core loci STR set and the Extended European Standard Set (ESS) (Welch et al., 2012;Tan et al., 2018). Based on these reasons, SNP-STRs combine the advantages of both SNPs and STRs, provide more information than DIP-SNPs and SNP-SNPs, have more candidates than DIP-STRs, and offer the possibility of gaining better insights into population genetic processes.
Massively parallel sequencing (MPS) technology can directly detect each strand's allele combinations separately. However, the higher costs and lack of consistent nomenclature, reporting standards, and existing national DNA database infrastructure to support statistical calculations, impose practical challenges for the introduction of MPS for routine forensic procedures (Just and Irwin, 2018). Capillary electrophoresis (CE) platforms are still the most commonly used strategy in forensic laboratories.
Our previous research reported 11 SNP-STR markers based on a CE platform (Tan et al., 2018). Furthermore, in conjunction with a polymerase chain reaction (PCR) technique based on an amplification refractory mutation system (ARMS), SNP-STRs can be used to target minor DNA at an excess of 100-fold of major DNA, providing a powerful method to analyze unbalanced DNA mixtures. The successful detection of cell-free fetal DNA in the peripheral blood of pregnant women using this method in our previous study demonstrated the effectiveness of this method for forensic unbalanced DNA mixture analysis . However, the linked STRs of the 11 SNP-STRs are uncommon in the expanded CODIS core loci STR set, ESS, or other forensic commonly used STR kits.
In order to achieve maximum compatibility with existing DNA databases, and further explore the capacity of SNP-STR markers to help analyze unbalanced DNA mixtures, we aimed to develop further SNP-STR loci based on commonly used STR loci. Eighteen novel SNP-STRs based on the expanded CODIS core loci set and other commonly used STR kits were developed. Stable and sensitive SNP-STR multiplex PCR panels based on the CE platform were established. The capabilities of the SNP allele-specific ARMS primers of these loci were evaluated using simulated binary unbalanced mixtures. Forensic parameters were estimated based on a survey of 113 individuals from the southwest Chinese Han population. Furthermore, as the likelihood ratio (LR) has become the most commonly used method by forensic communities to determine the weight of evidence , a probabilistic framework for LR calculation for the SNP-STR profiling results of twoperson mixtures was developed in this study. Finally, we report on the first use of these markers in a case study to show the advantages and limitations of SNP-STRs for forensic practice.

GENETIC BACKGROUND
In our previous study, we developed SNP-STRs with ARMS technology to analyze unbalanced DNA mixtures (Wang et al., 2013(Wang et al., , 2015. Figure 1 shows the design principle of the SNP-STR ARMS primers. The principle of this technique is to design two forward allele-specific primers that can target two different allele sequences of a SNP, respectively. SNP-STR ARMS primers require the 3 -end nucleotide of the two forward primers to be SNP allele-specific and the second or third nucleotide from the 3 -end to be changed to create a mismatch. The mismatch is introduced to ensure that the matched SNP-specific primer is refractory to PCR on the "mismatch" template (no amplification products obtained), but can still anneal to the complementary "match" template to complete PCR amplification (SNP allelespecific products obtained). A common reverse primer is located downstream of the STR core region sequence. Both SNP and STR genotypes can be obtained using a single PCR.
Single nucleotide polymorphism-STR allele-specific primers allow for the selected amplification of the minor contributor's genotype, as long as it has alleles that are absent in the major contributor's genotype. For a two-person unbalanced mixture, the possible genotype configuration of the two contributors is represented by four scenarios (see Figure 2; Oldoni et al., 2015): (i) "informative genotype 1, " both the SNP genotypes of the major and minor DNA component are homozygous for the alternate allele, with a probability of M 2 N 2 + M 2 N 2 (M and N represent SNP allelic frequencies of a SNP-STR locus); (ii) "informative genotype 2, " the SNP genotype of the major DNA component is homozygous and the minor DNA component is heterozygous, with a probability of 2M 3 N + 2MN 3 . (iii) "informative genotype 3" represents the configuration when both the contributors are FIGURE 1 | The principle of allele-specific primer design for SNP-STR markers using the ARMS technique. Each of the two allele-specific SNP primers is labeled with different fluorescent dyes. The common reverse primer is located downstream of the STR core region sequence. The obtained allele peaks are detected using capillary electrophoresis (CE), with the resultant product peaks containing two genotypes: (1) the color represents the SNP allele; (2) the size corresponds to the STR allele.
homozygous-MM, no alleles will be observed if we analyze the trace sample using the N-primer. The probability is M4 + N4; and (iv) when the SNP genotype of the major DNA component is heterozygous-MN, no specific SNP allele for the minor DNA component exists, which is called the "uninformative genotype." The probability is 2M 3 N + 4M 2 N 2 + 2MN 3 .
Under the situation of "informative genotype 1" and "informative genotype 2, " the SNP-STR allele-specific primer could be used to specifically target the minor DNA component from the mixture. Hence, the probability that the minor DNA component could be specifically targeted is M 2 N 2 + M 2 N 2 + 2M 3 N + 2MN 3 , which can be defined as the I value (probability of informative genotypes). This value can be used to assess the capability of the SNP-STR marker to target the minor DNA components in binary DNA mixtures that contain high background levels of major DNA components (Oldoni et al., 2015).

Ethics Statement
The participants provided written informed consent to participate in this study, and for participants under the age of 16 years, the legal guardian provided written informed consent to participate. No children under the age of 8 years participated in the study. All samples were obtained under the supervision of the Ethical Committee of Sichuan University (KS2019042).
Official permission to further analyze crime-related DNA samples and to present the corresponding data was obtained by the HI-TECH Industrial Sub-branch of Chengdu Municipal Public Security Bureau, Sichuan, China.

Samples and DNA Preparation
Peripheral blood samples were collected from 113 unrelated Chinese Han individuals in Chengdu, China. Casework DNA samples from a sexual assault were used to report the first application of SNP-STR markers. A vaginal swab of the victim (No. 831-9) and reference blood samples of the victim (No. 831-1) and suspect (No. 831-2) were collected by the HI-TECH Industrial Sub-branch of Chengdu Municipal Public Security Bureau, Sichuan, China.
Genomic DNA was isolated using a whole blood extraction kit (BioTeke, Beijing, China) according to the manufacturer's instructions and quantified spectrophotometrically using a NanoDrop 1000 Spectrophotometer (Thermo Fisher Scientific, Waltham, MA, United States).

SNP-STR Candidate Screening
Single nucleotide polymorphism-STR loci were selected based on commonly used worldwide STRs, including those from the expanded CODIS core loci set, as well as other STRs available in commercial kits, such as the GlobalFiler PCR Amplification Kit (Thermo Fisher Scientific) and AGCU Expressmarker 22 kit (AGCU ScienTech Incorporation, Wuxi, Jiangsu, China) (Hares, 2015;He et al., 2017). The screening criteria were as follows: (i) SNP minor allele frequency in Han Chinese in Beijing, China (CHB), and Japanese in Tokyo, Japan (JPT), or East Asia (EAS) populations higher than 0.02, namely I > 0.04; (ii) SNP located within 400 bp of STR region; and (iii) targeted amplicons shorter than 550 bp.
The STRs mentioned above were searched for using the University of California Santa Cruz (UCSC, Santa Cruz, CA, United States) Genome Browser. SNPs in the flanking regions of STRs were filtered by dense "Base Position" and pack "STS FIGURE 2 | Possible genotype sets of the two contributors in a two-person mixture (A) and the probability table for all possible combinations (B). The orange regions represent informative genotype 1 and the green regions represent informative genotype 2. Gray and blank represent informative genotype 3 and uninformative genotype, respectively. M and N in (B) represent the biallelic SNP allele frequency of an SNP-STR locus. The probability of informative genotypes for one SNP-STR locus is: I = M 2 N 2 + M 2 N 2 + 2M 3 N + 2MN 3 = 2M 2 N 2 + 2M 3 N + 2MN 3 .
Markers" in "Mapping and Sequencing" and full "Common SNPs (147)" with MAFs between 0.02 and 0.5 in "Variation." All other options were hidden. The selected SNPs were confirmed to meet the screening criteria using the dbSNP database (NCBI, NIH, Bethesda, MD, United States). The SNP with the largest MAF or located closest to the STR was used in the following steps when there was more than one SNP around the STR locus.

Primer Design and Multiplex Establishment
General forward primers with the 3 end starting with the SNP and a reverse primer downstream of the STR for one SNP-STR locus were designed using the Primer3 web version 4.0.0 1 . Allele-specific primers were designed by introducing a deliberate mismatch at the antepenultimate or penultimate base at the 3' end of the forward primers. Two allele-specific forward primers and one standard reverse primer were obtained for each SNP-STR locus.
The SNP-STR multiplex was established to determine genotypes based on amplicon length and fluorophore label for each locus. The concentration of each primer was adjusted according to the capillary electrophoretic profiles of the amplification products generated by the primer mix reactions.

PCR Conditions and Genotyping
Singleplex PCR amplifications were performed in 10 µL reactions containing 5 µL Multiplex PCR Mix (Qiagen, Hilden, Germany), 0.5 µL primer mix (one forward primer and one reverse primer, 3 µM of each), 3.5 µL nuclease-free water, and 1 µL genomic DNA (1 ng/µL). Thermocycling was performed in a Mastercycler Nexus Gradient (Eppendorf, Hamburg, Germany) under the following conditions: initial denaturation at 95 • C for 15 min; 31 cycles of 30 s at 94 • C, 90 s at 58 • C, and 60 s at 72 • C; followed by a final extension step of 30 min at 60 • C. The reaction and cycling conditions for the multiplex PCR amplifications using the primer mix were the same as those used for singleplex amplifications.
The PCR products were separated and detected by capillary electrophoresis using a 3130XL Genetic Analyser (Applied Biosystems, Foster City, CA, United States) by adding 1 µL of product to 9 µL of a 40:1 mixture of HiDi GeneScan 600 LIZ Size Standard (Thermo Fisher Scientific). Electrophoretic conditions Chr. Location: the location of the SNP-STR on the chromosome; distance: number of bases between SNP and STR core area; SNP Allele: allele of SNP in SNP-STR locus; SNP MAF: minor allele frequency of SNP in dbSNP; CHB: Han Chinese in Beijing, China; JPT: Japanese in Tokyo, Japan; EAS: East Asia; I value: probabilities of informative genotypes for SNP-STR calculated using SNP frequencies from MAF.
were 9 s at 3 kV for injection and 1,500 s at 15 kV for the run. The initial data were analyzed using GeneMapper ID-X v1.2 software (Thermo Fisher Scientific), with the peak height threshold for each fluorescent dye set at 50 RFU. A total of 113 DNA samples were profiled using the primer mix for the SNP-STR multiplex. Amplification of the positive control was performed using 1 ng '9948-control' DNA (Health Gene Technologies, Ningbo, China). The negative control was a no-template control (NTC) consisting of ddH 2 O.

Specificity of Primers
Forward primers (data not show) for Sanger sequencing were based on SNP-STR allele-specific primers for amplification using the reverse primers and obtain products containing SNP and STR loci between reactive primers. Three samples with SNP genotypes that were homozygous for both alleles and one heterozygote for each SNP-STR locus were amplified using sequencing primers. The resultant PCR products were sequenced from both directions using the Sanger method (sequencing performed by Invitrogen, Shanghai, China) after being recycled from polyacrylamide gel electrophoresis-separated bands. STR genotypes in the SNP-STR multiplex for some of the tested samples were confirmed using commercial STR kits (AGCU Expressmarker 22 kit and GlobalFiler PCR Amplification Kit).

Sensitivity of PCR Assays
DNA samples with heterozygous SNP allele calls for each SNP-STR were selected to assess the minimum amount of template required for each allele-specific primer to obtain a positive profile. DNA samples diluted to 0.1, 0.05, 0.025, 0.01, and 0.005 ng/µL were amplified using the singleplex PCR conditions described in section Genotyping SNP-STRs.

Stutter Percentage
Peaks that were one repeat smaller or larger than the true allele (±0.5 bases) were considered stutter peaks. The stutter percentage of each primer was calculated by dividing the stutter peak height (n ± 1 repeat unit) by the associated allele peak height with 113 unrelated Chinese Han individuals. Peak height threshold of stutters was set to 20 RFU.

Simulated Mixture Detection
To assess the capability of each allele-specific primer to target the minor DNA components in extremely unbalanced binary mixtures, mixtures with a series of different ratios (1:50, 1:100, 1:200, and 1:500) were made (the amount of minor DNA template was fixed at 0.05 ng). For all the simulated mixtures, the major contributor was homozygous for a given SNP, while the minor contributor was heterozygous. Some of the DNA mixtures were genotyped using commercial STR kits (GlobalFiler PCR Amplification Kit) for comparison.

Casework Example
It is important to note that for SNP-STR analysis of the mixture, the only information needed from the main contributor's DNA is its SNP-heterozygosity or homozygosity. Suppose a mixed trace sample was collected from the victim, DNA of the victim and the suspect was both available. For SNP-STR analyses under this scenario, the steps were as follows: (i) amplify the reference DNA of the victim and suspect using the SNP-STR multiplex; (ii) select informative markers for which the major DNA contributor Frontiers in Genetics | www.frontiersin.org R in "Primer name" column: shared reverse primer; F-A/G/C/T in "Primer name" column: SNP A/G/C/T allele specific forward primer; in "Primer sequence" column, the underlined capital letters are the bases of allele-specific primers matched to SNP, the bold lowercase letters are bases mismatched to original sequence; 5 Dye: the fluorophore label at the 5 -end of SNP allele-specific primer; Panel: the SNP-STR multiplex panel designation of the SNP primer (A, B, or C); Amplicon size: length range for target amplicons.
is SNP homozygous. The primers specific to the opposite SNP alleles are then used to target its amplification from trace samples; and (iii) assess the LR value of the DNA results.
For the traditional method, autosomal STR profiles of the trace sample and reference samples were obtained using the AGCU Expressmarker 22 kit (AGCU ScienTech Incorporation, Wuxi, Jiangsu, China) according to the manufacturer's instructions. The SNP-STR genotype of the reference DNA was tested through SNP-STR multiplex PCR amplifications. Detection of the selected informative markers from the trace sample was based on Singleplex PCR amplifications. The reaction and cycling conditions, as well as the electrophoretic condition and data analysis method were the same as described in section PCR Conditions and Genotyping. A threshold of 50 RFU was used for peak calling.

General Parameters
Allele frequencies of each SNP-STR locus and corresponding SNPs were calculated for all 113 individuals. Exact tests for Hardy-Weinberg Equilibrium (HWE) as well as for linkage disequilibrium between all pairs of markers in the multiplex were performed using ARLEQUIN statistical software v3.5 (Excoffier and Lischer, 2010). To evaluate the detection performance of SNP-STRs for minor DNA in binary mixtures, the probabilities of informative genotypes (I) were estimated, using the formula I = 2M 2 N 2 + 2M 3 N + 2MN 3 (Figure 2), in which M and N are the frequencies of the M and N alleles in the 113 individuals, respectively. The average I value for the SNP was calculated and used to estimate the probabilities that SNP-STR markers (at least 5 and 10 loci) would be informative based on the cumulative binomial distribution of 40 trials (markers). Forensic parameters of SNP-STR and STR loci were assessed by calculating the matching probability (MP) and power of discrimination (DP) using Powerstats v1.2 (Promega, Madison, WI, United States).

LR Evaluation for Casework
Likelihood ratio is an established method that evaluates two contrasting hypotheses (prosecution vs. defense) (Gill and Haned, 2013;Dørum et al., 2014;Benschop et al., 2015;Slooten, 2017). Cereda et al. built an object-oriented Bayesian network (OOBN) to perform LR computation for DIP-STRs, and a series of casework DNA samples has been reported using this model (Cereda et al., 2014;Oldoni et al., 2017). Similar to the DIP-STR, the principle of SNP-STR used for two-person mixtures is based on the selected amplification of the minor contributor's genotype. In this study, we constructed a similar Bayesian network in a Python environment to obtain LRs for particular SNP-STR profiling results. More detailed descriptions of the model are contained in the Supplementary Material. This model was constructed and the calculations were performed using pgmpy, a Python library for working with probabilistic graphical models 2 .
Likelihood ratio values for autosomal STR typing results were calculated using EuroForMix (Bleka et al., 2016). Population allele frequencies published by He et al. and Xin et al. were used for calculations (He et al., 2017;Adnan et al., 2018). LRs for SNP-STR profiling results were calculated using the abovementioned Bayesian model. A detailed description is provided in the Supplementary Material. The SNP-STR allele frequencies used for calculations were derived from population research of 113 unrelated individuals in this study.

Information on SNP-STRs
Nineteen SNP-STRs were identified that met the screening criteria ( Table 1). The I value of these loci, calculated using the MAFs in dbSNP, were in the range of 0.123-0.375. All SNPs of the 19 SNP-STRs were verified to be biallelic, according to the dbSNP database.

Primers and Multiplex Panels
Two SNP allele-specific forward primers with mismatches introduced using the ARMS technique and one reverse primer were obtained for each SNP-STR locus. All target amplicons were shorter than 522 bp. High amplification efficiency was achieved with primers designed for 18 of the 19 SNP-STRs meeting the screening criteria. The rs11063971-VWA locus was excluded due to poor amplification. Details are listed in Table 2.
Three multiplex panels for 18 SNP-STRs were established: Panel-A, Panel-B, and Panel-C (Figure 3). The concentrations of each primer in the corresponding panels are listed in

Specificity Tests
To examine the specificity of SNP-STR primers, samples were selected from the 113 individuals and analyzed using Sanger sequencing. The SNP and STR genotypes of these samples were concordant with the Sanger results. Supplementary Figure 1 shows the Sanger sequencing data for a sample with the genotypes A9/C9 at rs9531308-D13S317. The GlobalFiler PCR Amplification Kit (Thermo Fisher Scientific) plus the EX22 STR Kit, containing all of the 18 STRs in our SNP-STR system, were used to test the STR profiles for the selected samples. All of the STR profiling results were consistent with our SNP-STR multiplex results (data not shown).

Simulated Mixture Analysis
The simulated mixture tests required that the major DNA component was SNP homozygous for a given SNP-STR  locus. Four primers (rs25768A-D5S818, rs8031604T-Penta E, rs7786079C-D7S820, and rs6736691A-D2S1338) were not tested because there were no homozygous genotypes for these SNPs according to the survey of 113 individuals. Among the 32 allele-specific primers tested, 26 successfully amplified the minor DNA without a masking effect from the major DNA in the mixtures at ratios from 1:50 to 1:500, three (rs4847015C-D1S1656, rs7962284T-D12S391, and rs2246512G-D10S1248) did so at ratios from 1:50 to 1:200, two (rs9362476C-SE33 and rs25768A-D5S818) did so at ratios from 1:50 to 1:100, and one (rs2070018T-FGA) did so at a ratio of 1:50 ( Figure 5B). Supplementary Figure 3 shows examples of the detection performance in unbalanced DNA mixtures for SNP-STRs. Supplementary Figure 4 shows examples of simulated DNA mixtures analyzed using a conventional STR kit.

Stutter Percentage
Owing to confusion with the minor contributor, a high percentage of stutter may make the analysis of the mixture more difficult. In this study, the stutter percentage was calculated by dividing the stutter peak height (n ± 1 repeat unit) by the associated allele peak height with 113 samples. The results of the average stutter percentage and SD for each locus are listed in Supplementary Table 2 and the average stutter plus three standard deviations were used to set the stutter filter threshold. The lowest average percentage of stutter was observed at locus rs58390469A-D2S441 (1.68%) and the highest was rs7962284C-D12S391 (9.16%). The recommended stutter filters would be useful for mixture analysis.

SNP-STR Performance Assessment
Haplotype frequencies are listed in Supplementary Table S3.
The locus with the minimum number of haplotypes was rs13413321-TPOX and that with the maximum was rs9362476-SE33. The SNP minor allele frequencies in the SNP-STRs ranged from 0.487 (rs13413321-TPOX) to 0.022 (rs7786079-D7S820) ( Table 3). Observed heterozygosity, expected heterozygosity, and the probability values (p) of the HWE test are listed in Supplementary Table S4. The p-values for rs17651965-CSF1PO, rs6736691-D2S1338, and rs9362476-SE33 were all less than 0.05. There was no significant linkage disequilibrium among the SNP-STR combinations located on the same chromosome after Bonferroni correction (p < 0.0003). The probabilities of informative genotypes (I) for each informative genotype combination, which were calculated using the observed SNP MAF to estimate the probability of detecting minor DNA in a mixture, are given in Table 3. On average, the current 18 SNP-STR markers demonstrated I values of 0.0724, 0.202, and 0.379 for three kind of informative genotype combinations, and a combined I value of 0.747, 0.984, and 0.999989, respectively, which means there is an approximate 74.7% probability of obtaining at least one informative genotype 1 marker of the minor contributor's DNA in a two-person mixture using this set of SNP-STR loci, and 98.4 and 99.9989% for informative genotype 2 and informative genotype 3, respectively. The results, shown in Table 4, indicate that if there are 40 SNP-STRs with comparable I value, 95.05% of DNA mixtures will have at least one informative genotype 1 marker, 92.83% will have at least five informative genotype 2 markers, and 97.06% will have at least ten informative genotype 3 markers based on the cumulative binomial distribution.
Forensic parameters containing match probability and power of discrimination of 18 SNP-STRs and 18 STRs are listed in Table 5. The match probability of the SNP-STRs ranged from 0.016 (rs9362476-SE33) to 0.185 (rs13413321-TPOX) and that of the STRs ranged from 0.016 (SE33) to 0.239 (TPOX). The power of discrimination of the SNP-STRs ranged from 0.815 (rs13413321-TPOX) to 0.984 (rs9362476-SE33) and that of the STRs ranged from 0.761 (TPOX) to 0.984 (SE33).

Traditional STR Profiling Results of the Casework
According to SWGDAM guidelines, if one or more loci have three or more alleles present, excluding tri-allelic loci, then the sample is assumed to be a mixture (SWGDAM, Accessed 6 November 2017). The autosomal STR profile of the trace sample had a maximum of four alleles at only one locus (vWA). Three loci (D3S1358, D2S1338, and D12S391) were shown to have three alleles, respectively. It can be inferred as a two-person mixture based on the maximum allele count (Dembinski et al., 2018). The STR profile of the trace sample showed that most alleles, even most of the alleles with a higher peak, correspond to the victim, indicating that the victim acts as the major component of this mixture. The combined LR

SNP-STR Profiling Results of the Casework
In this case study, the SNP genotypes of the victim and suspect constituted one locus of informative genotype 1, eight loci of informative genotype 2, six loci of informative genotype 3, and three loci of the uninformative genotype. For the trace sample, all informative alleles were successfully detected by using allele-specific primers targeting minor contributor's alleles. The loci that belonged to informative genotype 3 showed no peaks, as expected. The SNP-STR alleles of the trace sample and suspect corresponded to all of these markers. The average and combined LR values of SNP-STR informative alleles for this casework were calculated using the Bayesian model mentioned previously, and the results are shown in Table 6. The combined LR was obtained by multiplication because the SNP-STR markers are assumed to be independent. The combined LR reached 7.14 × 10 7 .
Combination: possible genotype sets of the two contributors in a two-person mixture. Trace sample: profiling results of the trace sample (No. 831-9) by using allele-specific primers targeting informative alleles. NR represents no alleles was obtained.

DISCUSSION
Our laboratory has previously reported 11 SNP-STRs, but few of the linked STRs are derived from commonly used databases (Tan et al., 2018). The discriminatory power for mixture analysis requires more loci for which more informative markers may be obtained. In this study, we aimed to develop more SNP-STR loci based on commonly used STRs, establishing a connection between SNP-STR markers and existing STR databases. Eighteen SNP-STR loci were screened here, of which 14 were derived from the expanded CODIS core loci set. Three loci (D6S1043, Penta D, and Penta E) are available in the AGCU Expressmarker 22 kit. We also included the most informative tetranucleotide loci studied to date, SE33, which is included in various CE-based kits (GlobalFiler, NGM Select, ESSPlex, PowerPlex ESI/ESX 17). SE33 is a core locus for the German National DNA Database (DAD) and has also been adopted by other laboratories in Europe (Butler, 2012). As described above, the capability of an SNP-STR assay to target the minor DNA component of a binary DNA mixture with high background levels of the major DNA component can be assessed using the I value, which is based on the minor allele frequency of the SNP (Tan et al., 2018). Since the SNP-STRs we screened were limited to commonly used STRs, we set the minimum SNP MAF filter to 0.02 to obtain more loci. There are far fewer indels than SNPs across the genome. During screening of SNP-STR candidates, we found that almost no indel loci were located near these STR loci. DIP-STR loci with I > 0.04, based on the 24 commonly used STRs, cannot be accessed.
The purpose of establishing a multiplex system is to facilitate the investigation of population genetics and its application to forensic casework. Due to limitations in the number of loci and the length of the target amplicons, all primers couldn't be mixed into a single panel. All 36 allele-specific SNP primers of the 18 SNP-STRs were eventually combined into three panels. Compared with singleplex profiling, the analytical efficiency was still greatly improved. Two sets of primers for each SNP-STR locus must be considered to obtain each sample's genotype, increasing the complexity of data analysis. Furthermore, manual data handling might introduce errors. The development of integrated software for SNP-STR genotyping data, as a follow-up study, would greatly simplify the analysis and improve its overall efficiency and accuracy.
No differences were found between the Sanger sequencing and SNP-STR profiling results. Conventional STR kits were used to compare the STR genotypes of DNA samples obtained by the SNP-STR primers, and no differences were found. These results highlight the specificity of the SNP-STR multiplex system established in this study.
For the sensitivity analysis, all SNP-STR primers in this study successfully amplified DNA using 0.05 ng of template per reaction. The sensitivity was equivalent to that of DIP-STR assays (minimum detection limit was 0.03-0.1 ng). DIP-STRs also enable the specific detection of minor DNA even when the amount of major DNA is 1,000-fold higher (Castella et al., 2013;Oldoni et al., 2015). However, as the results of the simulatedmixtures test showed, compared with the allele-specific primers of DIP-STR markers, ARMS primers of SNP-STR markers with 1-3 consecutive or discontinuous base mismatches may not be able to achieve similar specificity due to interference from major DNA. Furthermore, mutual interference among primers exists in the multiplex. Singleplex SNP-STR primers are recommended for targeting informative alleles from mixtures.
The number of alleles at each locus was increased by the combination of STRs and SNPs and unrelated to the MAF of the SNP. According to a population survey of 113 individuals, there were 10 rs7786079-D7S820 haplotypes, which is an increase of four compared to the six alleles of D7S820. The MAF of rs7786079 was 0.079. While the MAF of rs58390469 was 0.4599, there were 11 rs58390469-D2S441 haplotypes, which is an increase of only three compared to the eight alleles of D2S441. The incorporation of SNPs linked to STRs increased the forensic performance. The cumulative match probability (CMP) and the cumulative power of discrimination (CPD) of the 18 SNP-STRs were 2.87 × 10 −24 and >0.999999999999999, respectively, while the CMP and CPD of the 18 STRs were 4.56 × 10 −22 and >0.999999999999999, respectively. In the HWE test, p values for three of the SNP-STR loci were less than 0.05, which may have been caused by the selection of samples from a relatively small group.
Given that the SNP-STR loci in this study were screened only with commonly used STRs, their average I value was lower than that of the DIP-STR loci selected from across the genome (0.33) . When the panel of SNP-STR markers was increased to 40 loci, it resulted in at least 15 informative markers with more than 95% probability, which significantly raised the probability of resolving a mixture and provides more compelling evidence for personal identification. Based on the average LR in Table 8, we can infer that if there are 40 SNP-STR markers, we would have a >95% probability of generating a LR value of 4.98 × 10 6 . Further, if there are 50 SNP-STR markers, we would have a >98% probability of generating a LR value of 4.27 × 10 6 . Therefore, more SNP-STR loci is imperative to increasing the individual discrimination ability of the SNP-STR panel. A combination of SNP-STR and DIP-STR markers might be interesting for future studies.
Y-STR is commonly used in the analysis of forensic mixtures, especially in male-female mixtures from sexual assault cases. Given paternal inheritance, Y-STR haplotypes can only be used to exclude unrelated males and cannot be used for personal identification (Gusmão et al., 2006). Some studies have reported the use of massively parallel sequencing, whereby both STR core sequence and flanking sequence can be obtained at the same time. However, studies have also found that the presence of STR stutters in the analysis of high-proportion DNA mixtures makes it difficult to distinguish the allelic sequence of minor components from the overall sequencing results (Parson et al., 2016;van der Gaag et al., 2016;Jäger et al., 2017). Our SNP-STR analysis using the ARMS technique for simulated high proportion mixtures did not encounter the same difficulty.
The successful detection of SNP-STR alleles in unbalanced two-person mixtures depends on how the SNP genotypes of the mixture contributors compare to each other. Distinct allelic configurations are observed for each marker in practice. The evaluation processes may become tedious if this needs to be done manually. The use of the Bayesian model could make the process easier. It can also help to make evaluative procedures less prone to possible errors. OOBN, developed for DIP-STR analysis by Cereda et al., has the advantage of being able to deal with cases where the genetic information of further individuals (other than the suspect) needs to be considered (typically when the suspect's genotype is not available) (Cereda et al., 2014;Oldoni et al., 2017). However, neither can handle situations where extra minor alleles are observed, which suggests the presence of extra unknown contributors in the mixture. Future work should be conducted to address these gaps for both DIP-STRs and SNP-STRs.
We also report our first experience of the use of SNP-STR markers in casework. As expected, the highest average LR belonged to informative genotype 1. However, the locus (rs2070018-FGA), which belonged to informative genotype 2, had the highest LR (30.525). This is understandable because the allele frequency of C22.2 of rs2070018-FGA (0.018) is much smaller than those of A11 and A14 of rs58390469-D2S441 (0.199 and 0.177, respectively). An LR > 1 could be obtained even when no alleles were detected for the trace sample when the two contributors were both homozygous SNPs of the same kind. The LR value can range from extremely strong support of the prosecution hypothesis (LR ≥ 1 × 10 6 ) to extremely strong support of the defense hypothesis (LR ≤ 1 × 10 −6 ), given the evidence (Martire et al., 2014). In this casework, both the traditional STR and SNP-STR profiling results of the trace sample support the preposition that the victim and the suspect are the two contributors to the mixture. Compared to 2.86 × 10 3 for the traditional STR method, the combined LR reached 7.14 × 10 7 using the SNP-STR method in this casework example. There are three primary limitations in STR analysis. Firstly, minor DNA can be masked by major DNA if they share the same STR allele. In this situation, the SNP-STR method can target minor DNA if it has an SNP allele opposite to that of the major DNA. Secondly, some alleles for minor contributors may be absent using the traditional STR method if it is heavily imbalanced mixtures. Thirdly, it is sometimes difficult to distinguish a true STR allele of minor DNA from the stutter peaks of major DNA, or other noise signals in unbalanced mixtures (Bieber et al., 2016). Our results demonstrated that most allele-specific SNP-STR primers can target minor DNA at an excess of major DNA up to 1:500 with little influence from major DNA's signals. Therefore, SNP-STR markers may have a distinct advantage in unbalanced mixture analysis compared to STR markers.

CONCLUSION
In this study, we developed a novel SNP-STR system based on a CE platform. All 18 SNP-STR loci used in this study were derived from commonly used STRs, and the corresponding STR genotypes could be matched to the existing STR database. The development of the SNP-STR multiplex system simplifies the analytical process. SNP-STR allele-specific primers designed using the ARMS technique can be used to target the minor components in unbalanced binary DNA mixtures with little influence of major DNA's signals, suggesting that these markers may have a distinct advantage in unbalanced mixture analysis compared to STR markers. SNP-STRs can be used as an alternative profiling technique that usefully complements the broad range of approaches available to forensic practitioners.

ETHICS STATEMENT
Written informed consent was obtained from the individual(s) for the publication of any potentially identifiable images or data included in this article.