Past, Present, and Future of DNA Typing for Analyzing Human and Non-Human Forensic Samples

Forensic DNA analysis has evolved considerably since the first forensic samples were evaluated by restriction fragment length polymorphism (RFLP). Methodologies advanced from gel electrophoresis techniques to capillary electrophoresis and now to next-generation sequencing (NGS). Capillary electrophoresis was and still is the standard method used in forensic analysis. However, depending on the information needed, several different techniques can be used to type a DNA fragment. Short tandem repeat (STR) fragment analysis, Sanger sequencing, SNaPshot, and capillary electrophoresis-single strand conformation polymorphism (CE-SSCP) are a few of the techniques that have been used for the genetic analysis of DNA samples. NGS is the newest and most revolutionary technology and has the potential to be the next standard for genetic analysis. This review briefly surveys many of the techniques and applications that have been utilized for the analysis of human and non-human DNA samples.


INTRODUCTION
Forensic genetics applies genetic tools and scientific methodology to resolve criminal and civil litigation (Editorial, 2007). Locard's Exchange Principle states that every contact leaves a trace, making any evidence a key component in forensic analysis. Biological evidence can consist of cellular material or cell-free DNA from crime scenes, and as technologies improved, genetic methodologies were expanded to include human and non-human forensic analyses. Although these methodologies can be used for any genome, the prevalence of databases and standard guidelines has allowed human DNA typing to become the gold standard. This review discusses the historical progression of DNA analysis techniques, their strengths and limitations, and their possible forensic applications to human and non-human genetics.

METHODOLOGIES TO DETECT GENETIC DIFFERENCES IN HUMANS ARE THE "GOLD STANDARD"
"DNA Fingerprinting": The Beginning of Human Forensic DNA Typing "DNA fingerprinting" was serendipitously discovered in 1984 (Jeffreys, 2013). What they found propelled DNA "fingerprinting, " or DNA typing, to the forefront in legal cases to become the "gold standard" for forensic genetics in a court of law. Jeffreys first used restriction enzymes to fragment DNA, a method in which restriction endonucleases (RE) enzymes fragment the genomic DNA, producing restriction fragment length polymorphisms (RFLP) patterns. Since each RE recognizes specific DNA sequences to enzymatically cut the DNA, then inherent differences between gene sequences, due to evolutionary changes, will produce different fragment lengths. If the enzyme site is present in one individual but has changed in a different individual, the fragment lengths, once separated and visualized, will differ. While this technique was useful for some studies, Jeffreys did not find it useful for his particular genetic studies. Subsequently when working with the myoglobin gene in seals, he discovered that a short section of that gene -a minisatellitewas conserved and when isolated and cloned could be used to detect inherited genetic lineages as well as individualize a subject. Fragment length separation by electrophoresis, followed by transfer to Southern blot membranes, hybridized with a specific or non-specific complementary isotopic DNA probe, allowed for DNA fragments visualization (Jeffreys et al., 1985b). Upon careful analysis, Jeffreys determined that the fragments represented different combinations of DNA repetitive elements, unique to each individual, and could be used to better identify individuals or kinship lineages (Jeffreys et al., 1985b). Jeffreys' technology was used in several subsequent paternity, immigration, and forensic genetics cases (Gill et al., 1985;Jeffreys et al., 1985a;Evans, 2007). This was just the beginning of a whole new era in DNA typing.

Restriction Fragment Length Polymorphism (RFLP) Analysis: The Past
After Jeffreys' discoveries, many DNA analysis methods involving electrophoretic fragment separation were developed. Many were based on RFLP principles (Botstein et al., 1980), e.g., amplified fragment length polymorphism (AFLP) (Vos et al., 1995) and terminal restriction fragment length polymorphism (TRFLP) (Liu et al., 1997). Others, like length heterogeneity-polymerase chain reaction (LH-PCR) (Suzuki et al., 1998), were based on intrinsic insertions and deletions of bases within specific genetic markers. Sanger sequencing (Sanger and Coulson, 1975) and single-strand conformational polymorphism (SSCP) analysis (Orita et al., 1989), while also relying on electrophoretic separation, detect single-base sequence changes rather than insertions, deletions or RE site differences. While Jeffreys' DNA fingerprinting method provided a very high power of discrimination, its main limitations were that it was very time-consuming and required at least 10-25 ng of DNA to be successful (Wyman and White, 1980). With these limitations, RFLP was not always feasible for forensic cases.
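The restriction-site principle behind RFLP can be illustrated with a minimal in silico digest. This is only a sketch under stated assumptions: the two sequences are hypothetical toy examples, the function name `digest` is ours, and the DNA is treated as a single linear strand. EcoRI (recognition site GAATTC, cutting after the first G) is used purely as a familiar example enzyme.

```python
def digest(sequence: str, site: str, cut_offset: int) -> list:
    """Return fragment lengths after cutting at every occurrence of `site`.

    `cut_offset` is the position inside the recognition site where the cut falls.
    """
    cut_positions = []
    start = sequence.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    boundaries = [0] + cut_positions + [len(sequence)]
    return [right - left for left, right in zip(boundaries, boundaries[1:])]


ECORI_SITE = "GAATTC"  # EcoRI cuts between G and AATTC, so the offset is 1

# Hypothetical toy sequences: individual B carries a single base change
# (GAATTC -> GAGTTC) that destroys the second recognition site.
individual_a = "AAAA" + "GAATTC" + "CCCCCC" + "GAATTC" + "TTTT"
individual_b = "AAAA" + "GAATTC" + "CCCCCC" + "GAGTTC" + "TTTT"

fragments_a = digest(individual_a, ECORI_SITE, cut_offset=1)
fragments_b = digest(individual_b, ECORI_SITE, cut_offset=1)

print(fragments_a)  # three fragments
print(fragments_b)  # two fragments, because one site is lost
```

A single base change therefore turns a three-fragment pattern into a two-fragment pattern, which is the length difference that electrophoresis and probe hybridization make visible.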

Short Tandem Repeat (STR) Analysis: The Present
The polymerase chain reaction (PCR) was introduced by Kary Mullis in 1985 and helped transform all DNA analyses (Mullis et al., 1986). The current standard for human DNA typing is short tandem repeat (STR) analysis (McCord et al., 2019). This method amplifies highly polymorphic, repetitive DNA regions by PCR and separates them by amplicon length using capillary electrophoresis. These inheritable markers are a series of 2-7 bases tandemly repeated at a specific locus, often in non-coding genetic regions. Forensic STRs are commonly tetranucleotide repeats (Goodwin et al., 2011), chosen because of their technical robustness and high variation among individuals (Kim et al., 2015). The Combined DNA Index System (CODIS) uses 20 core STR loci, expanded in 2017, and several commercial kits are available that contain these STRs (Oostdik et al., 2014; Ludeman et al., 2018). After amplification, different fluorochromes on each primer set allow for visualization of STRs after deconvolution, creating an STR profile consisting of a combination of genotypes (Gill et al., 2015). This method has become the gold standard for human forensics. Its greatest strengths are the standardization of loci used by all laboratories and an extremely large searchable database of genetic profiles. However, limitations and challenges arise when dealing with highly degraded or low-template DNA samples. To overcome these technical challenges, standardized mini-STR kits have been developed that generate shorter amplicons of the core STR loci and can be used in the same manner for forensic cases (Butler et al., 2007; Constantinescu et al., 2012).
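To make the link between repeat number and amplicon length concrete, the following minimal sketch counts tandem repeats in two alleles and reports the resulting genotype. The flanking sequences, the GATA motif, and the function name `longest_repeat_run` are hypothetical illustrations, not the design of any commercial kit; real STR calling is done against allelic ladders rather than by string matching.

```python
import re


def longest_repeat_run(sequence: str, motif: str) -> int:
    """Return the largest number of uninterrupted tandem copies of `motif`."""
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence)
    return max((len(run) // len(motif) for run in runs), default=0)


MOTIF = "GATA"                                # hypothetical tetranucleotide repeat
FLANK_5_PRIME, FLANK_3_PRIME = "TTCC", "AGGT"  # hypothetical flanking regions

# Two alleles from one heterozygous individual: 9 and 12 repeats.
allele_1 = FLANK_5_PRIME + MOTIF * 9 + FLANK_3_PRIME
allele_2 = FLANK_5_PRIME + MOTIF * 12 + FLANK_3_PRIME

genotype = tuple(sorted(longest_repeat_run(a, MOTIF) for a in (allele_1, allele_2)))
amplicon_lengths = (len(allele_1), len(allele_2))

print(genotype)          # repeat counts at this locus
print(amplicon_lengths)  # lengths that electrophoresis would separate
```

Each additional repeat adds one motif length (here 4 bp) to the amplicon, which is why separation by length is sufficient to call the allele.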
Keep in mind that DNA typing of humans, a single species, is the gold standard because of (a) the concerted scientific effort to standardize the loci to analyze, (b) the development of commercial kits that can produce the same results regardless of the instrumentation or laboratory performing the work, (c) a compatible and very large database that provides allelic frequencies for all sub-populations of humans, (d) standardized statistical methods used to report the results and (e) the many court cases that have accepted human DNA typing evidence, setting the precedent for future cases to use DNA typing results.
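One way to see why allele frequency databases matter is the textbook product-rule estimate of a random match probability: under Hardy-Weinberg proportions a homozygous genotype has expected frequency p^2 and a heterozygous genotype 2pq, and the per-locus values are multiplied across independent loci. The sketch below is illustrative only. The locus names and frequencies are invented, and casework statistics typically include further corrections (for example for population substructure) that are omitted here.

```python
from math import prod


def genotype_frequency(genotype, allele_freqs):
    """Expected genotype frequency under Hardy-Weinberg proportions."""
    first, second = genotype
    p, q = allele_freqs[first], allele_freqs[second]
    return p * p if first == second else 2 * p * q


def random_match_probability(profile, frequency_table):
    """Multiply per-locus genotype frequencies (assumes independent loci)."""
    return prod(
        genotype_frequency(genotype, frequency_table[locus])
        for locus, genotype in profile.items()
    )


# Hypothetical allele frequencies; real values come from population databases.
frequency_table = {
    "LOCUS_A": {9: 0.10, 12: 0.20},
    "LOCUS_B": {15: 0.25},
}
# Hypothetical two-locus profile: heterozygous at LOCUS_A, homozygous at LOCUS_B.
profile = {"LOCUS_A": (9, 12), "LOCUS_B": (15, 15)}

rmp = random_match_probability(profile, frequency_table)
print(f"{rmp:.4f}")
```

With only two loci the estimate is modest; multiplying across many loci is what drives the very small match probabilities reported for full STR profiles, and it is also why species without frequency databases cannot yet be reported in the same way.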

Amplified Fragment Length Polymorphism (AFLP) Analysis
It was not long before scientists realized that non-human DNA could provide informative genetic evidence in forensic cases. Applications include bioterrorism, wildlife crimes, human identification through skin microorganisms, and many more (Arenas et al., 2017). Since large quantities of biological material are frequently not found at crime scenes, successful RFLP analyses were unlikely. AFLP analysis, which combines restriction enzymes and PCR technology (Vos et al., 1995), became a method for DNA fingerprinting using minute amounts of DNA from unknown sources. REs digest genomic DNA, and a constructed adapter sequence is then ligated to the ends of all fragments, allowing primers designed to recognize the adapter sequences to anneal. Subsequent amplification generates many amplicons of varying length, which are separated and visualized in an electropherogram or on a gel (Vos et al., 1995; Butler, 2012). AFLP markers have been used for plant forensic DNA typing because the method provides high discrimination, requires only small amounts of DNA, and is reproducible, all forensically important characteristics (Datwyler and Weiblen, 2006). For example, since most cannabis is clonally propagated, subsequent generations will have identical genetic profiles, as seen with AFLP (Miller Coyle et al., 2003), providing useful intelligence links back to the source population. But there is significant variation between cultivars and within populations, so the lack of a standard database representing the species' diversity for statistical comparisons greatly limits the method's applicability. Another forensic example is differentiating between marijuana and hemp, two morphologically and genetically similar plants, one an illicit drug while the other is not. In one study, three populations of hemp and one population of marijuana were analyzed with AFLP, producing 18 bands that were specific to hemp samples. Additionally, 51.9% of the molecular variance occurred within populations, indicating that these polymorphisms were useful for forensic individualization (Datwyler and Weiblen, 2006).

Terminal Restriction Fragment Length Polymorphism (TRFLP) Analysis
As a result of the anthrax letter attacks of 2001, microbial forensics came to the forefront (Schmedes et al., 2016), a discipline that combines multiple scientific specialties: microbiology, genetics, forensic science, and analytical chemistry. One method used to compare microbial communities is TRFLP (Liu et al., 1997; Osborn et al., 2000; Butler, 2012). With this method, the DNA is amplified using "universal," highly conserved primer sequences shared across all organisms of interest, e.g., the 16S rRNA genes in bacteria and Archaea, and the PCR products are then fragmented with REs (Table 1). After separation by capillary electrophoresis, only the fluorescently tagged terminal restriction fragments are visualized (Mrkonjic Fuka et al., 2007), reducing the profile complexity and providing high discrimination. TRFLP has been used to characterize complex microbial communities for forensic applications by linking the similarity of the amplicon patterns generated from the intrinsic soil communities to the evidence from a crime scene (Meyers and Foran, 2008; Habtom et al., 2017). This method provides a distinct pattern reflective of the microbial community, which is useful for forensic genetics, but it does not provide any sequence information. Another limitation is the lack of standardization of the primer pairs and REs used, making direct comparisons between studies difficult. This lack of standardization also hinders the development of a database for species identification. Additionally, the method is time-consuming because of the additional restriction digestion step, and the possibility of incomplete enzymatic digestion can complicate the interpretation of results (Osborn et al., 2000; Moreno et al., 2006).
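The reason TRFLP profiles are simpler than full digests is that only the labeled end of each amplicon is detected, so each amplicon contributes a single peak whose size is set by the restriction site nearest the labeled primer. The sketch below illustrates this under stated assumptions: the four amplicons are hypothetical toy sequences, the label is assumed to sit on the 5' end, and HaeIII (recognition site GGCC, cutting between G and C) is used only as an example enzyme.

```python
from collections import Counter


def terminal_fragment_length(amplicon: str, site: str, cut_offset: int) -> int:
    """Length of the 5'-labeled terminal fragment.

    If the amplicon has no recognition site, the uncut amplicon is detected.
    """
    first_site = amplicon.find(site)
    return len(amplicon) if first_site == -1 else first_site + cut_offset


HAEIII_SITE = "GGCC"  # HaeIII cuts GG^CC, so the offset is 2

# Hypothetical amplicons from four community members.
community = {
    "taxon_1": "ACGT" * 3 + "GGCC" + "ACGT" * 5,
    "taxon_2": "ACGT" * 6 + "GGCC" + "ACGT" * 2,
    "taxon_3": "ACGT" * 8,                                     # no site
    "taxon_4": "ACGT" * 3 + "GGCC" + "ACGT" * 2 + "GGCC" + "ACGT",  # two sites
}

# Peak size -> number of taxa contributing to that peak.
profile = Counter(
    terminal_fragment_length(seq, HAEIII_SITE, cut_offset=2)
    for seq in community.values()
)
print(sorted(profile.items()))
```

Here taxon_1 and taxon_4 share a terminal fragment size even though their internal sequences differ, which mirrors the limitation noted above that the pattern carries no sequence information.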

Length Heterogeneity-Polymerase Chain Reaction (LH-PCR)
Another methodology that has been used to characterize microbial communities is length heterogeneity-polymerase chain reaction (LH-PCR) (Suzuki et al., 1998). Universal primers complementary to highly conserved domains within genomes are used to amplify hypervariable sequences within specific sequence domains. The 16S/18S rRNA genes, the chloroplast genes or Internal Transcribed Spacer (ITS) regions are commonly used. This technique is based on the natural sequence length variation due to insertions and deletions of bases that occur within a domain (Moreno et al., 2006). It has been used to characterize microbial communities for forensic soil applications, where a correlation between geographic location and microbial profiles has proven to be more discriminating than elemental soil analysis (Moreno et al., 2006, 2011; Damaso et al., 2018). With LH-PCR, metagenomic DNA extracted from the soil is amplified using fluorescently labeled universal primers, and the amplicon peaks within the electropherogram represent the minimum diversity within the community. However, specific sequence information is not obtained, and a peak of a given size could represent more than one species, thereby masking the community's actual taxonomic diversity. A recent study showed that the intrinsic diversity of a microbial mat, masked by LH-PCR, could be further resolved on the basis of inherent sequence differences using capillary electrophoresis-single strand conformational polymorphism (CE-SSCP) analysis (Damaso et al., 2014) and confirmed by sequencing. The advantage of LH-PCR is that it is a fast and reproducible method that can correlate geographical areas to microbial patterns with bioinformatics (Damaso et al., 2018), but a soil database would need to be developed for it to be useful beyond specific geographical areas.
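The statement that LH-PCR peaks represent only the minimum diversity can be shown with a small sketch that bins amplicons by length. The taxon names and sequences are hypothetical, and the function name `length_profile` is ours.

```python
from collections import defaultdict


def length_profile(amplicons: dict) -> dict:
    """Group amplicons by length, as an electropherogram effectively does."""
    peaks = defaultdict(list)
    for name, sequence in amplicons.items():
        peaks[len(sequence)].append(name)
    return dict(peaks)


# Hypothetical amplicons: taxon_A and taxon_B differ by one base but not in
# length; taxon_C carries a 2-base insertion.
amplicons = {
    "taxon_A": "ACGTACGTAC",
    "taxon_B": "ACGTTCGTAC",
    "taxon_C": "ACGTACGTACGG",
}

peaks = length_profile(amplicons)
print(len(peaks), "peaks from", len(amplicons), "taxa")
```

Two peaks are observed for three taxa, so the peak count is a lower bound on diversity; resolving taxon_A from taxon_B requires a sequence-sensitive method such as CE-SSCP or sequencing.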

Sanger Sequencing and Single Nucleotide Polymorphism (SNP) Variation
The basis of genomic differentiation is the intrinsic order of base pairs within a region, which can be evaluated by sequencing. Sanger sequencing has been the gold standard since the 1970s (Sanger and Coulson, 1975) because its single-base-pair resolution allows full sequence information to be determined. Robust and extensive databases, e.g., GenBank, are also readily available for comparison to identify an organism. However, it does have some limitations, such as its short read length (<500-700 bp) and its inability to sequence mixtures of organisms without cloning, so it would not be useful for sequencing complex microbial communities without intense time, effort and cost.
Other approaches identify intrinsic single-base sequence variation using single nucleotide polymorphisms (SNPs), which fall into four forensically relevant classes: identity-testing, ancestry informative, phenotype informative, and lineage informative. SNPs are particularly useful when typing degraded DNA or increasing the amount of genetic information retrieved from a sample (Budowle and van Daal, 2008; Goodwin et al., 2011). SNaPshot™ is a commercially available SNP kit that can identify known SNPs using single base extension (SBE) technology (Daniel et al., 2015; Fondevila et al., 2017). Wildlife forensics has used SNaPshot™ to identify endangered or trafficked species that are illegally poached, in support of criminal prosecutions. Elephant species identification from ivory and ivory products (Kitpipit et al., 2017) and the differentiation of wolf species from dog subspecies (Jiang et al., 2020) are both examples of SNaPshot™ assays developed for wildlife forensics. By using species-specific SNPs, the samples could be identified. But yet again, the limitation is the need for species-specific reference databases and the monumental task of developing a robust database for each species. Human SNP databases with allele frequencies, such as dbSNP, are available, however, making their forensic application more feasible in some cases.
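The logic of single base extension can be sketched as follows: a primer anneals immediately next to a known SNP, and the one base added to its 3' end reveals the allele. The sketch below simplifies this by working on the strand that has the same sense as the primer, so the incorporated base is simply the base that follows the primer sequence. The primer, templates, and the SNP-to-species table are hypothetical and do not describe any published assay.

```python
def sbe_call(sense_strand: str, primer: str) -> str:
    """Return the single base that would be added to the primer's 3' end."""
    start = sense_strand.find(primer)
    if start == -1:
        raise ValueError("primer does not anneal to this template")
    snp_index = start + len(primer)
    if snp_index >= len(sense_strand):
        raise ValueError("no template base available after the primer")
    return sense_strand[snp_index]


PRIMER = "GATTACAGG"  # hypothetical primer ending one base before the SNP

# Hypothetical templates that differ only at the SNP position.
sample_1 = "CC" + PRIMER + "A" + "TTGC"
sample_2 = "CC" + PRIMER + "G" + "TTGC"

# Hypothetical lookup of species-diagnostic alleles.
species_by_allele = {"A": "species 1", "G": "species 2"}

calls = [species_by_allele[sbe_call(s, PRIMER)] for s in (sample_1, sample_2)]
print(calls)
```

The lookup table is the part that depends on a reference database: without validated species-specific alleles, the called base cannot be interpreted.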

Next-Generation Sequencing: The Present
Massively parallel sequencing (MPS), or next-generation sequencing (NGS), allows mixtures of genomes of any species to be sequenced in one analysis (Ansorge, 2009). This technology can sequence thousands of genomic regions simultaneously, allowing for whole-genome sequencing, metagenomic sequencing or targeted amplicon sequencing (Gettings et al., 2016). Various NGS platforms are available, each using a slightly different chemistry to sequence DNA (Heather and Chain, 2016). Verogen has developed kits explicitly for human forensic genomics using Illumina's MiSeq FGx system (Guo et al., 2017; Moreno et al., 2018). The FBI recently approved DNA profiles generated by Verogen forensic technology to be uploaded into the National DNA Index System (NDIS) (SWGDAM, 2019), making it the first NGS technology approved for NDIS. STR mixture deconvolution, analysis of degraded or low-template samples, and even microbial community profiling are just a few of the potential NGS applications for forensic genomics and metagenomics (Borsting and Morling, 2015). In human STR analyses, the greatest challenge is mixture deconvolution. NGS technology offers an increased power of discrimination of STR alleles using intrinsic SNP-based genetic microhaplotypes (combinations of 2-4 closely linked SNPs within an allele) (Kidd et al., 2014; Pang et al., 2020). However, the analysis programs used to deconvolve mixtures have not been standardized or accepted to the same level as those for STRs.
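The gain in discrimination from sequence-level alleles can be illustrated with a toy mixture at a single STR locus. In this sketch, alleles of identical length are indistinguishable by size but are separated once internal SNPs are read. The reads are hypothetical and error-free, and the contributor estimate uses the simple maximum-allele-count heuristic (each diploid contributor carries at most two alleles per locus), not a probabilistic genotyping method.

```python
from math import ceil


def min_contributors(distinct_alleles: int) -> int:
    """Minimum number of diploid contributors implied by the allele count."""
    return ceil(distinct_alleles / 2)


# Hypothetical alleles, all 40 bases long; allele_y and allele_z each carry
# one internal A->G change at a different repeat unit.
allele_x = "TCTA" * 10
allele_y = "TCTA" * 4 + "TCTG" + "TCTA" * 5
allele_z = "TCTA" * 7 + "TCTG" + "TCTA" * 2
reads = [allele_x, allele_y, allele_z, allele_x, allele_z]

length_based = {len(read) for read in reads}   # what fragment sizing sees
sequence_based = set(reads)                    # what sequencing sees

print(len(length_based), "allele by length ->",
      min_contributors(len(length_based)), "contributor(s) minimum")
print(len(sequence_based), "alleles by sequence ->",
      min_contributors(len(sequence_based)), "contributor(s) minimum")
```

By length alone the locus looks like a single-source homozygote, whereas the sequence data indicate at least two contributors. Real data would also require handling of stutter and sequencing error, which is part of why standardized analysis software matters.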
Microbes are the first responders to changes in any environment because they are rapidly affected by the availability of nutrients and by their intrinsic habitats. This makes them excellent indicators for studies investigating postmortem interval (PMI) or soil geographical provenance (Giampaoli et al., 2014; Finley et al., 2015). In decaying organisms, shifts in epinecrotic communities or the thanatomicrobiome are becoming increasingly critical components in investigating PMI (Javan et al., 2016). Sequencing of the thanatomicrobiome revealed that Clostridium spp. varied during different stages of human decomposition, the "Postmortem Clostridium Effect" (PCE), providing a time signature of the thanatomicrobiome that could only have been uncovered through NGS (Javan et al., 2017). However, the lack of consensus in analysis techniques must be addressed before NGS methodologies can be introduced into the justice system (Table 1).

FUTURE DIRECTIONS AND CONCLUDING REMARKS
Forensic DNA typing has progressed quickly within a short timeframe (Figure 1), which can be attributed to the many advancements in molecular biology technologies. As these techniques advance, forensic scientists will analyze more atypical forms of evidence to answer questions deemed unresolvable with traditional DNA analyses. For example, epigenetics and DNA methylation markers have been proposed to estimate age, determine the tissue type, and even differentiate between monozygotic twins (Vidaki and Kayser, 2018). However, since epigenetic patterns are also influenced by environmental factors, they can be dynamic, and a number of confounding factors have the potential to affect predictions and must be taken into account when preparing prediction models (e.g., for age estimation). Additionally, phenotype informative SNPs across the genome can be used to infer physical characteristics like eye, hair, and skin color, even age, from an unknown source of DNA retrieved from a crime scene. But this technology could pose an "implicit bias" toward minorities, especially in "societies where racism and xenophobia are now on the rise" (Schneider et al., 2019), if not ethically and judicially implemented. With the increased sensitivity of NGS, low biomass samples of environmental DNA (eDNA), i.e., DNA from soil, water, or air, can complement and enhance intelligence gathering or provenance in criminal cases. Pollen and dust are two types of eDNA recently explored for their future forensic potential (Alotaibi et al., 2020; Young and Linacre, 2021). However, if the eDNA collected for a criminal investigation has interacted with other environments, a protocol or quality control measure must be established to account for the variability that is likely to occur. This makes prudent validation of this type of DNA analysis essential. Limitations also arise from the lack of a database for sample comparison and of statistical analyses to evaluate the strength of a match, as exist for human STR profiles.
Frontiers in Ecology and Evolution | www.frontiersin.org
DNA has long been the gold standard in human forensic analysis because of the standardization of DNA markers, databases and statistical analyses. It has laid the foundation for these promising new technologies that will significantly enhance intelligence gathering and species identification, human and non-human, in forensic cases. In order for these methodologies to be useful in criminal investigations, they must adhere to legal standards such as the Frye or Daubert Standards, which determine whether expert testimony or evidence is admissible in court. A method can be deemed acceptable if it follows forensic guidelines set by organizations such as NIST's Organization of Scientific Area Committees (OSAC), the Society for Wildlife Forensic Sciences (SWFS), the Scientific Working Group on DNA Analysis Methods (SWGDAM), and the International Society for Forensic Genetics (ISFG), to name a few. These committees provide the guidelines for validation, interpretation, and quality assurance, all necessary components for DNA analysis. The US Fish and Wildlife forensic laboratory has standardized protocols for crimes against federally endangered or threatened species¹. However, the more common limiting factors in the development of standard guidelines for non-human forensic genetic analyses across different state laboratories are the lack of consensus in methodologies, of supporting allelic databases and of standardized statistical analyses. Addressing those issues could lay the foundation for non-human analyses to be on par with human analyses.

AUTHOR CONTRIBUTIONS
DJ designed and wrote the manuscript. DM edited and contributed to the writing of the manuscript. Both authors contributed to the article and approved the submitted version.