<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Ecol. Evol.</journal-id>
<journal-title>Frontiers in Ecology and Evolution</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Ecol. Evol.</abbrev-journal-title>
<issn pub-type="epub">2296-701X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fevo.2020.00105</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Ecology and Evolution</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Assessing DNA Sequence Alignment Methods for Characterizing Ancient Genomes and Methylomes</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Poullet</surname> <given-names>Marine</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/858214/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>Orlando</surname> <given-names>Ludovic</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<xref ref-type="corresp" rid="c001"><sup>&#x002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/706808/overview"/>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>Laboratoire d&#x2019;Anthropobiologie et d&#x2019;Imagerie de Synth&#x00E8;se, CNRS UMR 5288, Facult&#x00E9; de M&#x00E9;decine de Purpan</institution>, <addr-line>Toulouse</addr-line>, <country>France</country></aff>
<aff id="aff2"><sup>2</sup><institution>GLOBE Institute, University of Copenhagen</institution>, <addr-line>Copenhagen</addr-line>, <country>Denmark</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Michael Knapp, University of Otago, New Zealand</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Kieren James Mitchell, University of Adelaide, Australia; Katharina Dulias, University of York, United Kingdom; Peter D. Heintzman, UiT The Arctic University of Norway, Norway</p></fn>
<corresp id="c001">&#x002A;Correspondence: Ludovic Orlando, <email>ludovic.orlando@univ-tlse3.fr</email></corresp>
<fn fn-type="other" id="fn004"><p>This article was submitted to Paleoecology, a section of the journal Frontiers in Ecology and Evolution</p></fn>
</author-notes>
<pub-date pub-type="epub">
<day>06</day>
<month>05</month>
<year>2020</year>
</pub-date>
<pub-date pub-type="collection">
<year>2020</year>
</pub-date>
<volume>8</volume>
<elocation-id>105</elocation-id>
<history>
<date date-type="received">
<day>28</day>
<month>11</month>
<year>2019</year>
</date>
<date date-type="accepted">
<day>31</day>
<month>03</month>
<year>2020</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x00A9; 2020 Poullet and Orlando.</copyright-statement>
<copyright-year>2020</copyright-year>
<copyright-holder>Poullet and Orlando</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license>
</permissions>
<abstract>
<p>Applying high-throughput DNA sequencing technologies to the ancient DNA molecules preserved in subfossil material can provide genetic information from past individuals, populations, and communities at the genomic scale. The combination of dedicated statistical techniques and specific molecular tools aimed at reducing the impact of post-mortem DNA damage can also help recover epigenetic data from ancient individuals. However, the capacity of different sequence aligners to identify ultrashort and deaminated ancient DNA templates and their impact on the characterization of ancient methylomes remain overlooked. In this study, we use both simulated and real ancient DNA sequence data to benchmark the performance of the read alignment tools most commonly used in ancient DNA research. We identify a read alignment strategy making use of the Bowtie2 aligner that substantially reduce computational times but shows increased sensitivity relative to previous recommendations based on the BWA aligner. This strategy significantly improves the genome coverage especially when DNA templates are shorter than 90 bp, as is typically the case for ancient DNA. It also impacts on ancient DNA methylation estimates as it maximizes coverage improvement within CpG dinucleotide contexts, which hold the vast majority of DNA methylation marks in mammals. Our work contributes to improve the accuracy of DNA methylation maps and to maximize the amount of recoverable genetic information from archeological and subfossil material. As the molecular complexity of ancient DNA libraries is generally limited, the mapping strategy recommended here is essential to limit both sequencing costs and sample destruction.</p>
</abstract>
<kwd-group>
<kwd>ancient DNA</kwd>
<kwd>DNA methylation</kwd>
<kwd>DNA damage</kwd>
<kwd>alignment</kwd>
<kwd>mapping</kwd>
<kwd>coverage</kwd>
<kwd>genome</kwd>
<kwd>methylome</kwd>
</kwd-group>
<contract-sponsor id="cn001">Universit&#x00E9; de Toulouse<named-content content-type="fundref-id">10.13039/501100004718</named-content></contract-sponsor>
<contract-sponsor id="cn002">Centre National de la Recherche Scientifique<named-content content-type="fundref-id">10.13039/501100004794</named-content></contract-sponsor>
<contract-sponsor id="cn003">H2020 European Research Council<named-content content-type="fundref-id">10.13039/100010663</named-content></contract-sponsor>
<contract-sponsor id="cn004">Villum Fonden<named-content content-type="fundref-id">10.13039/100008398</named-content></contract-sponsor>
<counts>
<fig-count count="5"/>
<table-count count="1"/>
<equation-count count="0"/>
<ref-count count="54"/>
<page-count count="13"/>
<word-count count="0"/>
</counts>
</article-meta>
</front>
<body>
<sec id="S1">
<title>Introduction</title>
<p>The first genome from an ancient human individual was sequenced in 2010 (<xref ref-type="bibr" rid="B34">Rasmussen et al., 2010</xref>) and was immediately followed by the genome sequencing of a Neanderthal (<xref ref-type="bibr" rid="B12">Green et al., 2010</xref>) and Denisovan (<xref ref-type="bibr" rid="B35">Reich et al., 2010</xref>) individual, two extinct archaic hominins. Since then, hundreds of ancient genomes have been characterized across many branches of the tree of life, including humans, horses, dogs, pigs, cattle, goats, wooly mammoths, but also many human pathogens and crops such as maize, sorghum, and barley (see <xref ref-type="bibr" rid="B27">Marciniak and Perry, 2017</xref> and <xref ref-type="bibr" rid="B3">Brunson and Reich, 2019</xref> for reviews). Ancient genome time series have made it possible to chart migration, admixture, and selection through space and time at unprecedented resolution. They have provided many opportunities to revisit evolutionary scenarios developed from patterns of cultural variation among archeological sites (e.g., the spread of steppe-related ancestry during the Eneolithic and early Bronze Age, see <xref ref-type="bibr" rid="B1">Allentoft et al., 2015</xref>; <xref ref-type="bibr" rid="B14">Haak et al., 2015</xref>; <xref ref-type="bibr" rid="B6">Damgaard et al., 2018a</xref>; <xref ref-type="bibr" rid="B30">Narasimhan et al., 2019</xref>; <xref ref-type="bibr" rid="B53">Wang et al., 2019</xref>) and from patterns of genetic variation in present-day populations (e.g., the temporal and geographic rise of lactose tolerance in western Eurasia, <xref ref-type="bibr" rid="B29">Mathieson et al., 2015</xref>; <xref ref-type="bibr" rid="B46">S&#x00E9;gurel and Bon, 2017</xref>).</p>
<p>The variation present in ancient DNA sequences does not only inform us about the genetic affinities of past individuals, populations, and species. It can also provide insights into ancient epigenetic landscapes, which play a crucial role in the regulation of gene expression (<xref ref-type="bibr" rid="B21">Lea et al., 2018</xref>) in response to infection (<xref ref-type="bibr" rid="B49">Smith et al., 2014</xref>; <xref ref-type="bibr" rid="B31">Pacis et al., 2015</xref>) as well as social (<xref ref-type="bibr" rid="B20">Laubach et al., 2019</xref>; <xref ref-type="bibr" rid="B40">Santos et al., 2019</xref>; <xref ref-type="bibr" rid="B41">Sanz et al., 2019</xref>; <xref ref-type="bibr" rid="B51">Snyder-Mackler et al., 2019</xref>) and environmental (<xref ref-type="bibr" rid="B8">Fagny et al., 2015</xref>) cues. It can, thus, help predict individual phenotypes in the past (see <xref ref-type="bibr" rid="B32">Pedersen et al., 2014</xref> and <xref ref-type="bibr" rid="B17">Hangh&#x00F8;j et al., 2016</xref> for age predictions on ancient individuals, or <xref ref-type="bibr" rid="B11">Gokhman et al., 2019</xref> for morphological predictions).</p>
<p>Although methods have been developed to infer nucleosome maps in ancient tissues (<xref ref-type="bibr" rid="B32">Pedersen et al., 2014</xref>; <xref ref-type="bibr" rid="B17">Hangh&#x00F8;j et al., 2016</xref>), most of ancient epigenetic work thus far has focused on detecting DNA methylation within CpG dinucleotide (CpG) contexts. While molecular tools such as bisulfite sequencing (<xref ref-type="bibr" rid="B26">Llamas et al., 2012</xref>; <xref ref-type="bibr" rid="B50">Smith et al., 2015</xref>) or immunoprecipitation (<xref ref-type="bibr" rid="B45">Seguin-Orlando et al., 2015</xref>) have been used, genome-wide DNA methylation maps have been mostly produced through statistical inference leveraging the differential sequence footprint of post-mortem DNA damage at methylated and unmethylated sites, in particular at CpGs (see <xref ref-type="bibr" rid="B15">Hangh&#x00F8;j and Orlando, 2018</xref> for a review). When molecular tools are used to prevent the sequencing of those unmethylated CpGs that have been degraded into UpGs, ancient methylated CpGs can indeed be revealed through CpG&#x2192;TpG mis-incorporations in the sequence data (see &#x201C;Materials and Methods&#x201D;). Most recent methodologies have been proposed to mitigate the impact of evolutionary divergence and/or sequence variation at CpG sites on the calculation of DNA methylation scores (<xref ref-type="bibr" rid="B16">Hangh&#x00F8;j et al., 2019</xref>).</p>
<p>High-quality DNA sequence alignments against a reference genome are essential to make accurate predictions of past genetic and epigenetic variation. Yet, the vast majority of ancient DNA studies make use of read aligner software that were developed for mapping short read sequences produced from rather long DNA molecules extracted from fresh tissues. They are, thus, not optimized for the ultra-short and degraded nature of ancient DNA templates (<xref ref-type="bibr" rid="B5">Dabney et al., 2013</xref>). Several studies have contrasted a range of mapping conditions to identify those most specific and most sensitive (e.g., <xref ref-type="bibr" rid="B43">Schubert et al., 2012</xref>; <xref ref-type="bibr" rid="B4">Cahill et al., 2018</xref>) or to mitigate the extent of reference bias (<xref ref-type="bibr" rid="B13">G&#x00FC;nther and Nettelblad, 2019</xref>; <xref ref-type="bibr" rid="B28">Martiniano et al., 2019</xref>). Yet, the sensitivity and specificity of other read aligners for ancient DNA data, such as Bowtie2 (<xref ref-type="bibr" rid="B19">Langmead and Salzberg, 2012</xref>), as well as the impact of different read alignment strategies on ancient methylation inference, remain untested. However, since the latter relies on patterns of CpG&#x2192;TpG mis-incorporations introduced in the genome sequence data by post-mortem DNA damage, the read-to-reference edit distance is expected to be increased at methylated sites. This may affect the alignment sensitivity at such sites and, in turn, impact the accuracy of DNA methylation inference for ancient individuals. It is, thus, essential to investigate the possible sensitivity of read alignment methods at CpG dinucleotides so as to not underestimate DNA methylation levels along the genome and to accurately identify differentially methylated regions between individuals showing different levels of post-mortem DNA damage.</p>
<p>In this study, we assess the performance of 11 read alignment strategies for mapping ancient DNA sequence data against reference genomes and their impact on the inference of ancient DNA methylation. Our main purpose is not to carry out an exhaustive investigation about the impact of mapping parameters on an entire array of sequence data reflecting various post-mortem DNA decay conditions (for such studies, please refer to, e.g., <xref ref-type="bibr" rid="B43">Schubert et al., 2012</xref>; <xref ref-type="bibr" rid="B4">Cahill et al., 2018</xref>; <xref ref-type="bibr" rid="B37">Renaud et al., 2018</xref>). We instead focus on identifying those parameters and factors with potential impact on DNA methylation, using publicly available sequence data carefully selected from the literature to have been generated both in the absence and presence of USER treatment of the same ancient DNA extracts. The latter currently provide the best source of information to estimate CpG methylation level from patterns of CpG&#x2192;TpG mis-incorporations (see <xref ref-type="bibr" rid="B15">Hangh&#x00F8;j and Orlando, 2018</xref> for a review). Overall, we uncover that the end-to-end alignment mode of Bowtie2 shows better performance than all other commonly used alternatives. Simulated and real ancient human DNA sequence data reveal that the coverage can be increased by up to 2.1&#x2013;9.4% for a given sequencing effort. The gain in recovered read alignments is particularly important within CpGs and significantly impacts the inference of regional DNA methylation levels. Applying such alignment procedures thus, improves both the quality of the genome and epigenetic data produced from ancient individuals and extinct species.</p>
</sec>
<sec id="S2" sec-type="materials|methods">
<title>Materials and Methods</title>
<sec id="S2.SS1">
<title>Ancient DNA Sequence Datasets</title>
<p>Previously published raw sequence data from four ancient human individuals were downloaded from the European Nucleotide Archive (<xref ref-type="table" rid="T1">Table 1</xref>). For three of the four ancient humans (SI, SIII, and SIV, <xref ref-type="bibr" rid="B48">Sikora et al., 2017</xref>), Illumina DNA sequences were generated for libraries prepared with and without treatment with the USER enzymatic mix (<xref ref-type="bibr" rid="B38">Rohland et al., 2015</xref>). The USER treatment makes use of a first enzymatic activity, the uracil DNA glycosylase, to eliminate uracil residues (U) accumulated in ancient DNA templates due to post-mortem deamination of cytosine residues (C). This leaves abasic sites as targets for a second enzymatic reaction in which the Endonuclease VIII cleaves the DNA backbone 3&#x2032; of the abasic site. As a result, the fraction of DNA library templates containing U residues is reduced, which limits the number of C&#x2192;T nucleotide mis-incorporations introduced during sequencing (<xref ref-type="bibr" rid="B2">Briggs et al., 2010</xref>). USER treatment is, however, inefficient for those C residues that were methylated but deaminated post-mortem, as the uracil DNA glycosylase shows no activity on the resulting thymine (T) residues. Therefore, C&#x2192;T nucleotide mis-incorporations are mostly restricted to methylated loci in the presence of USER treatment (<xref ref-type="bibr" rid="B32">Pedersen et al., 2014</xref>). In these conditions, the read-to-genome edit distance can be expected to be inflated at such sites, which may affect the performance of read alignment software. In the absence of USER treatment, this effect is, however, expected to impact all C residues that were deaminated post-mortem, be methylated or not. Contrasting sequence data generated from raw or USER-treated ancient DNA extracts provided, thus, an opportunity to assess the performance of read alignment software at methylated loci. The sequence data underlying the ancient genomes of Sunghir Upper Paleolithic individuals originate from similar preservation conditions and were generated both in the presence and in the absence of USER treatment. These data thus provided us with an opportunity to assess whether mapping conditions could affect regional methylation prediction.</p>
<table-wrap position="float" id="T1">
<label>TABLE 1</label>
<caption><p>Sample and sequence information.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<td valign="top" align="left">Name</td>
<td valign="top" align="left">Experimental conditions</td>
<td valign="top" align="center">Age (years ago)</td>
<td valign="top" align="left">Location</td>
<td valign="top" align="left">Bone</td>
<td valign="top" align="center">Mean fragment length</td>
<td valign="top" align="center">Read pairs</td>
<td valign="top" align="left">Libraries</td>
<td valign="top" align="center">ENA accession number</td>
<td valign="top" align="left">Publication</td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Sunghir SI</td>
<td valign="top" align="left">USER+</td>
<td valign="top" align="center">33,875&#x2013;31,770</td>
<td valign="top" align="left">Russia</td>
<td valign="top" align="left">Molar root</td>
<td valign="top" align="center">50.636</td>
<td valign="top" align="center">15,069,820</td>
<td valign="top" align="left">SI_388_USER_14_CGTATA</td>
<td valign="top" align="center">PRJEB22592</td>
<td valign="top" align="left"><xref ref-type="bibr" rid="B48">Sikora et al., 2017</xref></td>
</tr>
<tr>
<td valign="top" align="left">Sunghir SI</td>
<td valign="top" align="left">USER&#x2212;</td>
<td valign="top" align="center">33,875&#x2013;31,770</td>
<td valign="top" align="left">Russia</td>
<td valign="top" align="left">Molar root</td>
<td valign="top" align="center">55.162</td>
<td valign="top" align="center">20,168,236</td>
<td valign="top" align="left">SI_388_NOT_USER_13_CTATCA</td>
<td valign="top" align="center">PRJEB22592</td>
<td valign="top" align="left"><xref ref-type="bibr" rid="B48">Sikora et al., 2017</xref></td>
</tr>
<tr>
<td valign="top" align="left">Sunghir SIII</td>
<td valign="top" align="left">USER+</td>
<td valign="top" align="center">35,154&#x2013;33,031</td>
<td valign="top" align="left">Russia</td>
<td valign="top" align="left">Molar root</td>
<td valign="top" align="center">64.873</td>
<td valign="top" align="center">12,419,661</td>
<td valign="top" align="left">SIII_386_USER_39_CGACCT</td>
<td valign="top" align="center">PRJEB22592</td>
<td valign="top" align="left"><xref ref-type="bibr" rid="B48">Sikora et al., 2017</xref></td>
</tr>
<tr>
<td valign="top" align="left">Sunghir SIII</td>
<td valign="top" align="left">USER&#x2212;</td>
<td valign="top" align="center">35,154&#x2013;33,031</td>
<td valign="top" align="left">Russia</td>
<td valign="top" align="left">Molar root</td>
<td valign="top" align="center">70.630</td>
<td valign="top" align="center">13,336,119</td>
<td valign="top" align="left">SIII_386_NOT_USER_2_CGATGT</td>
<td valign="top" align="center">PRJEB22592</td>
<td valign="top" align="left"><xref ref-type="bibr" rid="B48">Sikora et al., 2017</xref></td>
</tr>
<tr>
<td valign="top" align="left">Sunghir SIV</td>
<td valign="top" align="left">USER+</td>
<td valign="top" align="center">34,485&#x2013;33,499</td>
<td valign="top" align="left">Russia</td>
<td valign="top" align="left">Femur</td>
<td valign="top" align="center">60.055</td>
<td valign="top" align="center">12,029,755</td>
<td valign="top" align="left">SIV_392_USER_23_AGCATG</td>
<td valign="top" align="center">PRJEB22592</td>
<td valign="top" align="left"><xref ref-type="bibr" rid="B48">Sikora et al., 2017</xref></td>
</tr>
<tr>
<td valign="top" align="left">Sunghir SIV</td>
<td valign="top" align="left">USER&#x2212;</td>
<td valign="top" align="center">34,485&#x2013;33,499</td>
<td valign="top" align="left">Russia</td>
<td valign="top" align="left">Femur</td>
<td valign="top" align="center">64.448</td>
<td valign="top" align="center">10,100,614</td>
<td valign="top" align="left">SIV_392_NOT_USER_10_TAGCTT</td>
<td valign="top" align="center">PRJEB22592</td>
<td valign="top" align="left"><xref ref-type="bibr" rid="B48">Sikora et al., 2017</xref></td>
</tr>
<tr>
<td valign="top" align="left">NE1</td>
<td valign="top" align="left">USER&#x2212;</td>
<td valign="top" align="center">5,070&#x2013;5,310</td>
<td valign="top" align="left">Hungary</td>
<td valign="top" align="left">Petrous bone</td>
<td valign="top" align="center">69.762</td>
<td valign="top" align="center">63,774,886</td>
<td valign="top" align="left">NE1_SRR1186790</td>
<td valign="top" align="center">PRJNA240906</td>
<td valign="top" align="left"><xref ref-type="bibr" rid="B9">Gamba et al., 2014</xref></td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<attrib><italic>The name, ages, and location of each ancient DNA specimen considered in this study are provided with respect to the original publication reporting the DNA sequence data. The numbers of read sequencing pairs considered in the analyses are provided and represent only a subset of the overall data available for download at the European Nucleotide Archive (ENA). The experimental conditions indicate whether raw ancient DNA extracts where treated (USER+) or not (USER&#x2212;) with the USER enzymatic mix prior to DNA library construction. Ages are calibrated radiocarbon ages. The mean fragment length corresponds to BWA ds.</italic></attrib>
</table-wrap-foot>
</table-wrap>
</sec>
<sec id="S2.SS2">
<title>Data Simulation</title>
<p>Quantifying the sensitivity and predicted positive value of DNA alignment software requires the identification of the fraction of reads correctly mapped (true positives), those not correctly mapped (false positives), and those not mapped at all (false negatives). In order to assess those performance statistics, we simulated DNA sequence data using the human (hg19, <xref ref-type="bibr" rid="B52">The Genome Sequencing Consortium, 2001</xref>) reference genome and Gargammel (<xref ref-type="bibr" rid="B36">Renaud et al., 2017</xref>). This software returns DNA sequences of a selected size and can include sequencing errors typical of Illumina DNA sequencing instruments and, optionally, DNA mis-incorporations reflecting post-mortem DNA damage. A total of 3.3 million read pairs were simulated both in the presence and in the absence of ancient DNA damage for an entire size range of DNA templates overlapping typical ancient DNA size distributions. This included 100,000 read pairs for each size increment of one nucleotide within the 25&#x2013;45 bp range, as well as 100,000 read pairs for each size increment of five nucleotides within the 45&#x2013;90 bp range, and finally 100,000 read pairs for each size increment of 10 nucleotides within the 90&#x2013;120 bp range. DNA damage was simulated using the DNA mis-incorporation of sample SIII produced by mapDamage2 (<xref ref-type="bibr" rid="B18">J&#x00F3;nsson et al., 2013</xref>). The alignment file that was used as input for mapDamage2 was generated by Paleomix (version 1.2.13.2, <xref ref-type="bibr" rid="B42">Schubert et al., 2014</xref>) using the same human reference genome as above and the default end-to-end alignment mode of Bowtie2, sensitive.</p>
</sec>
<sec id="S2.SS3">
<title>Read Processing and Alignment</title>
<p>Both simulated and real ancient DNA sequence data were processed using Paleomix. This automated computational pipeline carries out a number of read processing steps, including adapter trimming, pair collapsing, mapping, quality/size filtering, duplicate removal, and local realignment. Mapping was performed using both BWA (<xref ref-type="bibr" rid="B22">Li and Durbin, 2009</xref>) and Bowtie2 (<xref ref-type="bibr" rid="B19">Langmead and Salzberg, 2012</xref>), which represent the two most commonly used read alignment software in ancient DNA research. BWA version 0.7.17 was used in this study, together with two main alignment modes (backtrack and mem). The backtrack algorithm was applied both using seed or disabling seeding with default parameters (-n 0.04), as recommended by <xref ref-type="bibr" rid="B43">Schubert et al. (2012)</xref> for ancient DNA data. Version 2.3.5.1 of the Bowtie2 read mapper was used, applying both the local and end-to-end alignment modes and the four sensitivity options provided (very fast, fast, sensitive, and very sensitive). Combined, this represented a total of 11 read alignment conditions. Read pairs were automatically collapsed as single reads when showing sufficient sequence overlap and the base quality was recalculated according to sequence match at those overlapping positions, following the default procedure implemented in the AdapterRemoval2 software (<xref ref-type="bibr" rid="B44">Schubert et al., 2016</xref>). Reads shorter than 25 bp post-trimming and/or collapsing were disregarded except for BWA mem where reads shorter than 30 bp were disregarded. Computational running times were recorded using the time bash command.</p>
</sec>
<sec id="S2.SS4">
<title>Coverage and DNA Methylation Calculations</title>
<p>Binary Alignment Map (BAM) read alignment files and summary files obtained from Paleomix were processed for a number of analyses. First, average depth-of-coverage was calculated disregarding alignments showing quality scores strictly lower than 30. This corresponded to the estimated endogenous coverage provided in the Paleomix summary file. Second, average depth-of-coverage estimates were calculated at CpG, CpA, CpC, and CpT dinucleotides using the coverage option of Bedtools (<xref ref-type="bibr" rid="B33">Quinlan and Hall, 2010</xref>), conditioning on the bed coordinates of each dinucleotide type present in the human reference genome (-d option). The coordinates were obtained using Seqkit Version 0.3.1.1 (<xref ref-type="bibr" rid="B47">Shen et al., 2016</xref>). Third, we repeated the previous calculations after soft-clipped bases present in the read alignments were masked using the Jvarkit Biostar84452 tool (<xref ref-type="bibr" rid="B24">Lindenbaum, 2015</xref>). All previous analyses were carried out on both the read and simulated DNA sequence data. For simulated data, we also estimated the sensitivity and positive predicted value of each alignment condition. The alignment sensitivity was measured by dividing the number of true-positive alignments by their sum with the number of false-negative alignments [i.e., true positives/(true positives + false negatives] (<xref ref-type="bibr" rid="B44">Schubert et al., 2016</xref>). The alignment positive predictive value was estimated as the fraction of all simulated reads that were correctly mapped [i.e., true positives/(true positives + false positives)] (<xref ref-type="bibr" rid="B44">Schubert et al., 2016</xref>). Reads were considered as true positives if they showed a minimum of 80% of their length overlapping the known genomic coordinates used for simulation. Reads were considered as false negative when not mapping and false positives otherwise. These three categories were identified using python version 2.7.5 and the pysam library (<xref ref-type="bibr" rid="B23">Li et al., 2009</xref>). Additionally, DNA methylation analyses were carried out using the recently developed DamMet package (version 1.0.1, <xref ref-type="bibr" rid="B16">Hangh&#x00F8;j et al., 2019</xref>), in which the fraction of DNA methylation, f, can be estimated for a given genomic region including a pre-selected number of CpG dinucleotides. In this study, we selected 22,845 regions showing a total of 100 CpG dinucleotides in the human reference genome as the amount of sequence data considered was not sufficient to retrieve genuine estimates in regions of smaller sizes (data not shown). The corresponding genomic coordinates were provided to DamMet in the form of a BED coordinate file using the -B option. The f DNA methylation values were directly retrieved for each genomic window from the DamMet output. For consistency, coverage estimates within each window were calculated using the approach described above. All plots were generated using RStudio Version 1.1.463 (<xref ref-type="bibr" rid="B39">RStudio Team, 2016</xref>) and the ggplot2 library (<xref ref-type="bibr" rid="B54">Wickham, 2016</xref>).</p>
</sec>
</sec>
<sec id="S3">
<title>Results</title>
<sec id="S3.SS1">
<title>Overall Alignment Performance</title>
<p>BWA (<xref ref-type="bibr" rid="B22">Li and Durbin, 2009</xref>) represents the most common software for aligning ancient DNA data against a reference genome. Previous work has established that disabling seeding in BWA increased mapping sensitivity for ancient DNA data, owing to the presence of inflated mis-incorporation rates at read ends (<xref ref-type="bibr" rid="B43">Schubert et al., 2012</xref>). Additional work investigated the specificity and sensitivity of Bowtie2 for ancient DNA data (<xref ref-type="bibr" rid="B4">Cahill et al., 2018</xref>). The performance of both aligners has, however, not been benchmarked on ancient DNA data with the specific aim to assess their possible impact on the inference of ancient DNA methylation. We, thus, compared their overall alignment performance on previously published ancient DNA data from four ancient humans, consisting of three Upper Paleolithic individuals excavated at Sunghir (SI, SIII, and SIV) and one Neolithic individual from Hungary (NE1) (<xref ref-type="table" rid="T1">Table 1</xref>). This represented a total of 11 mapping conditions, including 3 for BWA (with/out seeding, and mem) and 8 for Bowtie2 (very fast, fast, sensitive, and very sensitive options for both the local and end-to-end alignment modes). Alignment performance was calculated by normalizing the genome coverage obtained in one mapping condition relative to that obtained when disabling seeding in BWA, after quality filtering and duplicate removal (<xref ref-type="fig" rid="F1">Figure 1</xref>).</p>
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption><p>Normalized coverage across all 11 mapping conditions investigated (real data). The average depth of coverage was estimated by filtering alignments for minimal mapping quality scores of 30 (MQ &#x2265; 30) and removing PCR duplicates. All coverage estimates are reported relative to those obtained with the BWA read aligner when disabling seeding (BWA ds). The other mapping conditions investigated in BWA correspond to seeding (BWA s) and mem (BWA mem). Two alignment modes (local, l; and end-to-end, e) were tested in Bowtie2, together with four options (fast, f; very fast, vf; sensitive, s; and, very sensitive, vs). The mean fragment length corresponding to BWA ds is indicated in <xref ref-type="table" rid="T1">Table 1</xref>.</p></caption>
<graphic xlink:href="fevo-08-00105-g001.tif"/>
</fig>
<p>We first confirmed previous work reporting reduced BWA performance when seeding, corresponding to a loss of 0.19&#x2013;0.51% coverage across all four ancient DNA sequence datasets investigated in the absence of USER treatment (<xref ref-type="fig" rid="F1">Figure 1</xref>, USER&#x2212;). BWA mem was found to show increased performance in all three Sunghir individuals, in which a gain of 1.66&#x2013;3.63% coverage was obtained. However, the performance was reduced (1.25%) for the NE1 individual. This indicates that the individual features of ancient DNA datasets, which reflect different post-mortem DNA preservation conditions, can have both a positive and a negative impact on the performance of the mem alignment procedure. The same was found for the four sensitivity options (very fast, fast, sensitive, and very sensitive) of the Bowtie2 local alignment mode, in which up to 2.63% coverage could be gained and up to 1.75% could be lost depending on the procedure considered. In these conditions, the very fast sensitivity option was the only one associated with a performance drop in all four samples tested (0.20&#x2013;1.75%). In contrast to what was observed for the local alignment mode, the end-to-end mode in Bowtie2 was consistently found to show reduced performance, representing a loss of 2.93%&#x2013;11.15% coverage.</p>
<p>We next assessed the mapping performance of ancient DNA data generated following USER treatment of raw DNA extracts (USER+). This treatment was developed to reduce the amount of DNA mis-incorporations resulting from post-mortem Cytosine deamination, which represents the most common DNA degradation reaction taking place after death (<xref ref-type="bibr" rid="B2">Briggs et al., 2010</xref>; <xref ref-type="bibr" rid="B5">Dabney et al., 2013</xref>). USER-treated ancient DNA data were available for the three Sunghir individuals but not for the NE1 individual. We found marginal coverage gain (0.01&#x2013;0.11%) when disabling seeding in BWA, and different performance for the mem alignment procedure, in which a fraction of coverage could be gained (0.43%) or lost (1.12%) (<xref ref-type="fig" rid="F1">Figure 1</xref>, USER+). Interestingly, all eight alignment modes tested for Bowtie2 were associated with increased performance, corresponding to a gain of 1.45&#x2013;4.76% coverage relative to what was obtained when disabling seeding in BWA. This is in striking contrast with the reduced performance observed for the end-to-end alignment mode in the absence of USER treatment and indicates that the USER treatment modified the properties of ancient DNA data sufficiently enough to positively impact on the alignment performance.</p>
<p>To further gain insights into the alignment consequences of USER treatment, we simulated ancient DNA sequence data of increasing size (25&#x2013;120 bp) and assessed the fraction of true positives, false positives, and false negatives obtained for each of the 11 alignment procedures tested (<xref ref-type="fig" rid="F2">Figure 2</xref> and <xref ref-type="supplementary-material" rid="FS1">Supplementary Figures S1</xref>, <xref ref-type="supplementary-material" rid="FS1">S2</xref>). We found that the fraction of false-negative alignments was minimal when using the BWA aligner, except for the mem alignment mode and DNA templates of sizes inferior to 35 bp. The end-to-end alignment mode in Bowtie2 also led to virtually no false-negative alignments across all size categories investigated, including for DNA templates of 25&#x2013;26 bp in which a detectable proportion of false-negative alignments was obtained when using the local alignment mode (albeit more limited than that observed in BWA mem for larger sizes, <xref ref-type="fig" rid="F2">Figure 2A</xref>). Applying strict mapping quality thresholds of 30 was found appropriate for eliminating all false-positive alignments obtained in all mapping conditions investigated (<xref ref-type="fig" rid="F2">Figure 2B</xref>). Interestingly, the fraction of true-positive alignments showing mapping quality scores strictly inferior to 30 increased in BWA rather than in Bowtie2 for DNA templates of limited sizes (25&#x2013;70 bp) (<xref ref-type="fig" rid="F2">Figure 2C</xref>). This fraction increased for larger sizes when using the Bowtie2 local alignment mode (70 bp) or the Bowtie2 end-to-end alignment mode (90 bp). This indicates that the mapping quality scores returned by BWA for short size categories such as those generally observed with ancient DNA data are more conservative than those returned by Bowtie2. Moreover, when applying a strict mapping quality threshold of 30 (as commonly practiced, e.g., <xref ref-type="bibr" rid="B48">Sikora et al., 2017</xref>), the sensitivity (i.e., the fraction of true positives relative to both true positives and false negatives) of BWA was more limited in the mem alignment mode than when seeding or disabling seeding for short DNA templates. It, however, returned to &#x223C;100% for size categories superior or equal to 40 bp (<xref ref-type="supplementary-material" rid="FS1">Supplementary Figure S3</xref>). Maximal sensitivity was observed in Bowtie2 using the end-to-end alignment mode (<xref ref-type="bibr" rid="B7">Damgaard et al., 2018b</xref>), resulting in a significant loss of true-positive alignments in BWA compared to Bowtie2 (<xref ref-type="supplementary-material" rid="FS1">Supplementary Figure S3</xref>). This effect is reversed for size categories larger than 70 bp (local mode) or 90 bp (end-to-end mode), but this is expected to minimally impact ancient DNA datasets due to the generally extensive DNA fragmentation that takes place post-mortem (<xref ref-type="fig" rid="F2">Figure 2C</xref>).</p>
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption><p>Alignment performance of simulated data. A total of 100,000 reads were simulated for each size category considered in the presence of typical Illumina sequencing errors as well as nucleotide mis-incorporations remaining following USER treatment. MQ refers to the mapping quality scores of the read alignments. <bold>(A)</bold> Fractions of true-positive, false-positive, and false-negative alignments. <bold>(B)</bold> Mapping quality scores of false-positive alignments. <bold>(C)</bold> Mapping quality scores of true-positive alignments.</p></caption>
<graphic xlink:href="fevo-08-00105-g002.tif"/>
</fig>
<p>We next tested this prediction by measuring the overall alignment performance of the 11 procedures investigated by calculating the total coverage achieved after applying a strict mapping quality threshold of 30 and removing PCR duplicates (<xref ref-type="fig" rid="F3">Figure 3</xref>). We confirmed that Bowtie2 showed an increased performance relative to BWA for DNA templates of size inferior to 70 bp when running with the local mode and for templates of size inferior to 90 when running the end-to-end mode.</p>
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption><p>Normalized coverage across all 11 mapping conditions investigated (simulated data). The average depth of coverage was estimated by filtering alignments for minimal mapping quality scores of 30 (MQ &#x2265; 30) and removing PCR duplicates. All coverage estimates are reported relative to those obtained with the BWA read aligner when disabling seeding (BWA ds). The other mapping conditions investigated in BWA correspond to seeding (BWA s) and mem (BWA mem). Two alignment modes (local, l and end-to-end, e) were tested in Bowtie2, together with four options (fast, f; very fast, vf; sensitive, s, and; very sensitive, vs). A total of 100,000 reads were simulated for each size category considered in the presence of typical Illumina sequencing errors as well as nucleotide mis-incorporations remaining following USER treatment. Each panel provides the coverage for each size category (25, 40, 50, 60, 70, 75, 80, 85, 90, and 100 bp).</p></caption>
<graphic xlink:href="fevo-08-00105-g003.tif"/>
</fig>
<p>Altogether, our USER-treated read simulations revealed that across all size categories. The sensitivity of the local alignment mode was reduced for DNA templates of size strictly inferior to 38 bp, but was generally larger than that observed with BWA mem. The quality scores returned by BWA in the short size range were found to be conservative, leading to the loss of a significant fraction of true positives (9.4&#x2013;10.0%) when applying strict quality thresholds (<xref ref-type="fig" rid="F2">Figure 2C</xref>).</p>
</sec>
<sec id="S3.SS2">
<title>Alignment Performance at CpG Sites and DNA Methylation Inference</title>
<p>The most commonly used strategy available for estimating ancient methylation maps leverages patterns of C&#x2192;T mis-incorporations at CpG dinucleotide sites as identified from BAM alignment files providing ancient DNA sequence alignment against a reference genome. We next investigated if the different mapping conditions investigated above showed different performance at CpG dinucleotide sites and could lead to different estimates of ancient DNA methylation levels. We first calculated the coverage achieved at each CpN dinucleotide context (i.e., CpA, CpC, CpG, and CpT) when applying the 11 mapping conditions to the sequencing data available for the three Sunghir individuals (<xref ref-type="fig" rid="F4">Figure 4</xref> and <xref ref-type="supplementary-material" rid="FS1">Supplementary Figure S4</xref>). This revealed results largely consistent with those obtained when measuring coverage genome-wide, in which the Bowtie2 end-to-end mode showed the poorest performance when considering data generated in the absence of USER treatment (<xref ref-type="fig" rid="F4">Figure 4</xref>, USER&#x2212;). The performance drop was more pronounced at CpG dinucleotides. This is most likely due to the faster cytosine deamination rates reported at such sites when methylated (<xref ref-type="bibr" rid="B45">Seguin-Orlando et al., 2015</xref>; <xref ref-type="bibr" rid="B50">Smith et al., 2015</xref>), which increases the read-to-reference edit distance and, thus, limits the alignment sensitivity.</p>
<fig id="F4" position="float">
<label>FIGURE 4</label>
<caption><p>Average depth of coverage in four dinucleotide contexts (real data). <bold>(A)</bold> Average depth of coverage when real data are generated in the absence of USER treatment. <bold>(B)</bold> Average depth of coverage when real data are generated following USER treatment. The average depth of coverage was estimated by filtering alignments for minimal mapping quality scores of 30 (MQ &#x2265; 30) and removing PCR duplicates. Coverage values are calculated in the dinucleotide sequence context most affected by DNA methylation (CpG), as well as the three other dinucleotides potentially affected by post-mortem cytosine deamination at the same position (i.e., CpA, CpC, and CpT). The differences observed are not due to soft-clipped bases as the values returned in the presence or not of soft-clipping masking are identical (<xref ref-type="supplementary-material" rid="FS1">Supplementary Figure S4</xref>).</p></caption>
<graphic xlink:href="fevo-08-00105-g004.tif"/>
</fig>
<p>In striking contrast, Bowtie2 showed increased performance for all eight alignment conditions investigated when considering data generated following USER treatment (<xref ref-type="fig" rid="F4">Figure 4</xref>, USER+). The performance gain was generally found to be especially pronounced within CpG dinucleotide contexts. This indicates that the USER treatment restored a fraction of reads that could not be previously aligned by reducing the read-to-reference edit distance. Since USER treatment is inefficient on those CpG dinucleotides that are methylated (<xref ref-type="bibr" rid="B2">Briggs et al., 2010</xref>; <xref ref-type="bibr" rid="B32">Pedersen et al., 2014</xref>; <xref ref-type="bibr" rid="B17">Hangh&#x00F8;j et al., 2016</xref>), we deduce that the Bowtie2 alignment conditions tested are more prone to result in gain of coverage at unmethylated CpG dinucleotides, which could have important consequences when deriving estimates of ancient DNA methylation levels.</p>
<p>The sensitive option of the end-to-end Bowtie2 alignment mode was found to show favorable running performance speed (<xref ref-type="supplementary-material" rid="FS1">Supplementary Figure S5</xref>). It also returned maximal sensitivity using simulated sequence data (<xref ref-type="supplementary-material" rid="FS1">Supplementary Figure S3</xref>) and maximal coverage gain at CpG dinucleotides on ancient DNA sequence data generated following USER treatment (<xref ref-type="fig" rid="F4">Figure 4</xref>). We, thus, next compared the impact of this mapping condition on DNA methylation estimates relative to that used in all previous ancient DNA data work and consisting of disabling seeding in BWA (<xref ref-type="bibr" rid="B10">Gokhman et al., 2014</xref>, <xref ref-type="bibr" rid="B11">2019</xref>; <xref ref-type="bibr" rid="B32">Pedersen et al., 2014</xref>; <xref ref-type="bibr" rid="B17">Hangh&#x00F8;j et al., 2016</xref>). To achieve this, we divided the human reference genome in windows comprising a total of 100 CpG dinucleotides and counted the number of such windows covered by at least one sequencing read in both alignment conditions. Although both alignment conditions identified read alignments in the vast majority of such genomic windows, we found that the total number of windows returning non-null coverage was larger for Bowtie2 than for BWA (<xref ref-type="fig" rid="F5">Figure 5A</xref>), in line with the increased coverage observed with both simulated and real data with this mapper. This demonstrates that Bowtie2 retrieves data within regions for which no sequencing data could be aligned with BWA, thereby extending the genomic contexts into which DNA methylation can be estimated. Additionally, the distribution of sequencing depth obtained across genomic windows of 100 CpG dinucleotides was shifted toward larger values when applying Bowtie2 instead of BWA (<xref ref-type="fig" rid="F5">Figure 5B</xref>, in which dashed lines indicate mean depth of coverage). This indicates that regional DNA methylation inference based on Bowtie2 alignments can build on more data than when based on BWA. This is important as the inference accuracy for ancient DNA methylation levels was previously shown to improve with sequencing depth (<xref ref-type="bibr" rid="B16">Hangh&#x00F8;j et al., 2019</xref>).</p>
<fig id="F5" position="float">
<label>FIGURE 5</label>
<caption><p>Impact of read alignment conditions on ancient DNA methylation inference. The analyses were carried out using the sequence data generated following USER treatment of the raw DNA extracts of the three ancient Sunghir individuals (SI, left; SIII, center, and; SIV, right). The consequences of two mapping conditions on DNA methylation inference are investigated (BWA disabling seeding, BWA ds versus Bowtie2 end-to-end sensitive, Bowtie2 es). <bold>(A)</bold> Venn diagram of genomic windows showing non-null sequence coverage. Numbers indicate the total of genomic windows comprising 100 CpG dinucleotides and for which non-null sequence coverage was observed. Most windows are covered in both mapping conditions, but a fraction was exclusively identified by only one read aligner. Each circle is not scaled proportionally to increase readability. <bold>(B)</bold> Coverage distribution of genomic windows containing 100 CpG dinucleotides. Dashed lines represent the mean depth, respectively. <bold>(C)</bold> Distribution of the difference observed between the DNA methylation values inferred by two mapping conditions (Delta F). The difference (Delta F) reported corresponds to the difference between the <italic>f</italic> values estimated from Bowtie2 es alignments and those estimated from BWA ds alignments (f.Bowtie2 es - f.BWA ds).</p></caption>
<graphic xlink:href="fevo-08-00105-g005.tif"/>
</fig>
<p>We next used DamMet (<xref ref-type="bibr" rid="B16">Hangh&#x00F8;j et al., 2019</xref>) to calculate in both alignment conditions the DNA methylation levels, f, for those genomic windows encompassing 100 CpG dinucleotides (<xref ref-type="fig" rid="F5">Figure 5C</xref>). We found that the distributions of differences between the <italic>f</italic> values returned from Bowtie2 and BWA read alignments were centered around zero, indicating that the two mapping conditions resulted in similar regional methylation estimates. However, a fraction of the windows considered returned <italic>f</italic> values of one (i.e., full methylation) when using the sequence data aligned with BWA and values of zero (i.e., full demethylation) when using Bowtie2 alignments. This represented a fraction of 0.040&#x2013;0.103% of the windows across the three ancient individuals investigated. Reciprocally, a fraction of the windows considered returned <italic>f</italic> values of zero when using the sequence data aligned with BWA and values of one when using Bowtie2 alignments. This represented a larger fraction of the windows across the three ancient individuals investigated (0.316&#x2013;1.392%), which demonstrates the increased sensitivity of the Bowtie2 aligner for those reads carrying CpG&#x2192;TpG substitutions and informing on regional methylation levels. This demonstrates that the alignment procedures can significantly impact on the inference of regional DNA methylation levels.</p>
</sec>
</sec>
<sec id="S4">
<title>Discussion</title>
<p>In this study, we report that Bowtie2 shows a higher performance than BWA when aligning ancient DNA data generated following USER treatment. This effect is especially pronounced within the shorter size range (25&#x2013;70 bp), due to the combined effects of a higher sensitivity for the Bowtie2 read aligner, and more conservative mapping quality scores for the BWA aligner. Moreover, in the absence of USER treatment, BWA mem was found to impact positively the coverage estimates for some samples, but negatively for others. This may be related to the respective representation of ultrashort templates among the sequencing data, as the most positive impact is found for those libraries showing the shortest average sizes. Although this remains to be tested systematically, it suggests that individual post-mortem DNA preservation conditions will significantly affect the performance of this alignment procedure. Nonetheless, the vast majority of DNA fragments retrieved from archeological and paleontological remains are of limited sizes; the mapping conditions investigated with Bowtie2 can be expected to substantially improve genome coverage estimates and, hence, data quality. Using the sequence data obtained from three Upper Paleolithic Sunghir individuals, we found that Bowtie2 could improve the genome average depth of coverage by up to 1.62&#x2013;3.72%. This improvement may appear modest at first glance but represents a significant improvement for ancient DNA research as the material available for destructive DNA extraction is finite and not replaceable. Additionally, improving sequencing depth is not always possible due to the limited molecular complexity of the DNA libraries available for sequencing. Importantly, read alignment conditions were found to impact not only depth of coverage but also the inference of regional ancient DNA methylation levels.</p>
<p>While ancient DNA methylation has received increasing scholar attention over the last 5 years, and while several statistical inference methods have been developed (e.g., BindDB, <xref ref-type="bibr" rid="B25">Livyatan et al., 2015</xref>; epiPaleomix, <xref ref-type="bibr" rid="B17">Hangh&#x00F8;j et al., 2016</xref>; and DamMet, <xref ref-type="bibr" rid="B16">Hangh&#x00F8;j et al., 2019</xref>), how much different read alignment methods could impact on methylation predictions had not been investigated. This study reveals that the Bowtie2 mapping conditions recommended (sensitive option, end-to-end mode) returns with larger numbers of read alignments that increase the number of genomic windows available for inference as well as their sequence coverage, which improves accuracy of the predictions. This has important consequences for the nascent field of ancient epigenomics, which, to the best of our knowledge, based all previous predictions on BWA DNA alignments. The fact that different read alignment conditions significantly impact the inference of ancient DNA methylation levels also implies that strictly identical alignment procedures are used when comparing DNA methylation levels in different ancient remains, including when groups showing different evolutionary distances to the reference genome used for alignments are considered (e.g., archaic hominins and anatomically modern humans, <xref ref-type="bibr" rid="B11">Gokhman et al., 2019</xref>).</p>
<p>Recent work has revealed that mapping ancient, ultrashort, and damaged ancient DNA reads against a linear reference genome can introduce substantial reference bias in the data, with possible impact on downstream population genetics inference (<xref ref-type="bibr" rid="B13">G&#x00FC;nther and Nettelblad, 2019</xref>). Alignment procedures including a variation graph recapitulating the known genetic variation within a panel of modern individuals further mitigated this bias and helped effectively recover non-reference variants (<xref ref-type="bibr" rid="B28">Martiniano et al., 2019</xref>). Future work should focus on assessing the impact of such alignment procedures on ancient DNA methylation inference.</p>
</sec>
<sec id="S5">
<title>Data Availability Statement</title>
<p>The datasets generated for this study can be found in the ENA Accession number Sunghir: <ext-link ext-link-type="DDBJ/EMBL/GenBank" xlink:href="PRJEB22592">PRJEB22592</ext-link>, ENA Accession number NE1: <ext-link ext-link-type="DDBJ/EMBL/GenBank" xlink:href="PRJNA240906">PRJNA240906</ext-link>.</p>
</sec>
<sec id="S6">
<title>Ethics Statement</title>
<p>Ethical review and approval was not required for the study on human participants in accordance with the local legislation and institutional requirements. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.</p>
</sec>
<sec id="S7">
<title>Author Contributions</title>
<p>LO conceived the study and provided material and infrastructure, and wrote the manuscript. MP carried out the analyses, with significant input from LO and plotted the figures.</p>
</sec>
<sec id="conf1">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. The reviewer, PH, declared a past collaboration with one of the author, LO, to the handling Editor.</p>
</sec>
</body>
<back>
<fn-group>
<fn fn-type="financial-disclosure">
<p><bold>Funding.</bold> This work was supported by the Initiative d&#x2019;Excellence Chaires d&#x2019;attractivit&#x00E9;, Universit&#x00E9; de Toulouse (OURASI), AMADEUS (CNRS LIA), and the Villum Fonden miGENEPI research project. LO has received funding from the European Research Council (ERC) under the European Union&#x2019;s Horizon 2020 Research and Innovation Program (grant agreement no. 681605-PEGASUS).</p>
</fn>
</fn-group>
<ack>
<p>We are grateful to all members of the AGES research group (Archaeology, Genomics, Evolution and Societies) for discussions.</p>
</ack>
<sec id="S10" sec-type="supplementary material"><title>Supplementary Material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fevo.2020.00105/full#supplementary-material">https://www.frontiersin.org/articles/10.3389/fevo.2020.00105/full#supplementary-material</ext-link></p>
<supplementary-material xlink:href="Table_1.docx" id="FS1" mimetype="application/vnd.openxmlformats-officedocument.wordprocessingml.document" xmlns:xlink="http://www.w3.org/1999/xlink"/>
</sec>
<ref-list>
<title>References</title>
<ref id="B1"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Allentoft</surname> <given-names>M. E.</given-names></name> <name><surname>Sikora</surname> <given-names>M.</given-names></name> <name><surname>Sj&#x00F6;gren</surname> <given-names>K.-G.</given-names></name> <name><surname>Rasmussen</surname> <given-names>S.</given-names></name> <name><surname>Rasmussen</surname> <given-names>M.</given-names></name> <name><surname>Stenderup</surname> <given-names>J.</given-names></name><etal/></person-group> (<year>2015</year>). <article-title>Population genomics of Bronze Age Eurasia.</article-title> <source><italic>Nature</italic></source> <volume>522</volume>:<issue>167</issue>. <pub-id pub-id-type="doi">10.1038/nature14507</pub-id> <pub-id pub-id-type="pmid">26062507</pub-id></citation></ref>
<ref id="B2"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Briggs</surname> <given-names>A. W.</given-names></name> <name><surname>Stenzel</surname> <given-names>U.</given-names></name> <name><surname>Meyer</surname> <given-names>M.</given-names></name> <name><surname>Krause</surname> <given-names>J.</given-names></name> <name><surname>Kircher</surname> <given-names>M.</given-names></name> <name><surname>P&#x00E4;&#x00E4;bo</surname> <given-names>S.</given-names></name></person-group> (<year>2010</year>). <article-title>Removal of deaminated cytosines and detection of in vivo methylation in ancient DNA.</article-title> <source><italic>Nucleic Acids Res.</italic></source> <volume>38</volume>:<issue>e87</issue>. <pub-id pub-id-type="doi">10.1093/nar/gkp1163</pub-id> <pub-id pub-id-type="pmid">20028723</pub-id></citation></ref>
<ref id="B3"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Brunson</surname> <given-names>K.</given-names></name> <name><surname>Reich</surname> <given-names>D.</given-names></name></person-group> (<year>2019</year>). <article-title>The promise of paleogenomics beyond our own species.</article-title> <source><italic>Trends Genet.</italic></source> <volume>35</volume> <fpage>319</fpage>&#x2013;<lpage>329</lpage>. <pub-id pub-id-type="doi">10.1016/j.tig.2019.02.006</pub-id> <pub-id pub-id-type="pmid">30954285</pub-id></citation></ref>
<ref id="B4"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cahill</surname> <given-names>J. A.</given-names></name> <name><surname>Heintzman</surname> <given-names>P. D.</given-names></name> <name><surname>Harris</surname> <given-names>K.</given-names></name> <name><surname>Teasdale</surname> <given-names>M. D.</given-names></name> <name><surname>Kapp</surname> <given-names>J.</given-names></name> <name><surname>Soares</surname> <given-names>A. E. R.</given-names></name><etal/></person-group> (<year>2018</year>). <article-title>Genomic evidence of widespread admixture from polar bears into brown bears during the last ice age.</article-title> <source><italic>Mol. Biol. Evol.</italic></source> <volume>35</volume> <fpage>1120</fpage>&#x2013;<lpage>1129</lpage>. <pub-id pub-id-type="doi">10.1093/molbev/msy018</pub-id> <pub-id pub-id-type="pmid">29471451</pub-id></citation></ref>
<ref id="B5"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dabney</surname> <given-names>J.</given-names></name> <name><surname>Knapp</surname> <given-names>M.</given-names></name> <name><surname>Glocke</surname> <given-names>I.</given-names></name> <name><surname>Gansauge</surname> <given-names>M.-T.</given-names></name> <name><surname>Weihmann</surname> <given-names>A.</given-names></name> <name><surname>Nickel</surname> <given-names>B.</given-names></name><etal/></person-group> (<year>2013</year>). <article-title>Complete mitochondrial genome sequence of a middle pleistocene cave bear reconstructed from ultrashort DNA fragments.</article-title> <source><italic>Proc. Natl. Acad. Sci. U.S.A.</italic></source> <volume>110</volume> <fpage>15758</fpage>&#x2013;<lpage>15763</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.1314445110</pub-id> <pub-id pub-id-type="pmid">24019490</pub-id></citation></ref>
<ref id="B6"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Damgaard</surname> <given-names>P. D. B.</given-names></name> <name><surname>Marchi</surname> <given-names>N.</given-names></name> <name><surname>Rasmussen</surname> <given-names>S.</given-names></name> <name><surname>Peyrot</surname> <given-names>M.</given-names></name> <name><surname>Renaud</surname> <given-names>G.</given-names></name> <name><surname>Korneliussen</surname> <given-names>T.</given-names></name><etal/></person-group> (<year>2018a</year>). <article-title>137 ancient human genomes from across the Eurasian steppes.</article-title> <source><italic>Nature</italic></source> <volume>557</volume> <fpage>369</fpage>&#x2013;<lpage>374</lpage>. <pub-id pub-id-type="doi">10.1038/s41586-018-0488-1</pub-id> <pub-id pub-id-type="pmid">30166629</pub-id></citation></ref>
<ref id="B7"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Damgaard</surname> <given-names>P. D. B.</given-names></name> <name><surname>Martiniano</surname> <given-names>R.</given-names></name> <name><surname>Kamm</surname> <given-names>J.</given-names></name> <name><surname>Moreno-Mayar</surname> <given-names>J. V.</given-names></name> <name><surname>Kroonen</surname> <given-names>G.</given-names></name> <name><surname>Peyrot</surname> <given-names>M.</given-names></name><etal/></person-group> (<year>2018b</year>). <article-title>The first horse herders and the impact of early Bronze Age steppe expansions into Asia.</article-title> <source><italic>Science</italic></source> <volume>360</volume>:<issue>eaar7711</issue>. <pub-id pub-id-type="doi">10.1126/science.aar7711</pub-id> <pub-id pub-id-type="pmid">29743352</pub-id></citation></ref>
<ref id="B8"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fagny</surname> <given-names>M.</given-names></name> <name><surname>Patin</surname> <given-names>E.</given-names></name> <name><surname>MacIsaac</surname> <given-names>J. L.</given-names></name> <name><surname>Rotival</surname> <given-names>M.</given-names></name> <name><surname>Flutre</surname> <given-names>T.</given-names></name> <name><surname>Jones</surname> <given-names>M. J.</given-names></name><etal/></person-group> (<year>2015</year>). <article-title>The epigenomic landscape of African rainforest hunter-gatherers and farmers.</article-title> <source><italic>Nat. Commun.</italic></source> <volume>6</volume>:<issue>10047</issue>. <pub-id pub-id-type="doi">10.1038/ncomms10047</pub-id> <pub-id pub-id-type="pmid">26616214</pub-id></citation></ref>
<ref id="B9"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gamba</surname> <given-names>C.</given-names></name> <name><surname>Jones</surname> <given-names>E. R.</given-names></name> <name><surname>Teasdale</surname> <given-names>M. D.</given-names></name> <name><surname>McLaughlin</surname> <given-names>R. L.</given-names></name> <name><surname>Gonzalez-Fortes</surname> <given-names>G.</given-names></name> <name><surname>Mattiangeli</surname> <given-names>V.</given-names></name><etal/></person-group> (<year>2014</year>). <article-title>Genome flux and stasis in a five millennium transect of European prehistory.</article-title> <source><italic>Nat. Commun.</italic></source> <volume>5</volume>:<issue>5257</issue>. <pub-id pub-id-type="doi">10.1038/ncomms6257</pub-id> <pub-id pub-id-type="pmid">25334030</pub-id></citation></ref>
<ref id="B10"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gokhman</surname> <given-names>D.</given-names></name> <name><surname>Lavi</surname> <given-names>E.</given-names></name> <name><surname>Pr&#x00FC;fer</surname> <given-names>K.</given-names></name> <name><surname>Fraga</surname> <given-names>M. F.</given-names></name> <name><surname>Riancho</surname> <given-names>J. A.</given-names></name> <name><surname>Kelso</surname> <given-names>J.</given-names></name><etal/></person-group> (<year>2014</year>). <article-title>Reconstructing the DNA Methylation maps of the neandertal and the denisovan.</article-title> <source><italic>Science</italic></source> <volume>344</volume> <fpage>523</fpage>&#x2013;<lpage>527</lpage>. <pub-id pub-id-type="doi">10.1126/science.1250368</pub-id> <pub-id pub-id-type="pmid">24786081</pub-id></citation></ref>
<ref id="B11"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gokhman</surname> <given-names>D.</given-names></name> <name><surname>Mishol</surname> <given-names>N.</given-names></name> <name><surname>de Manuel</surname> <given-names>M.</given-names></name> <name><surname>de Juan</surname> <given-names>D.</given-names></name> <name><surname>Shuqrun</surname> <given-names>J.</given-names></name> <name><surname>Meshorer</surname> <given-names>E.</given-names></name><etal/></person-group> (<year>2019</year>). <article-title>Reconstructing denisovan anatomy using DNA Methylation maps.</article-title> <source><italic>Cell</italic></source> <volume>179</volume> <fpage>180</fpage>&#x2013;<lpage>192</lpage>. <pub-id pub-id-type="doi">10.1016/j.cell.2020.01.020</pub-id> <pub-id pub-id-type="pmid">32032517</pub-id></citation></ref>
<ref id="B12"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Green</surname> <given-names>R. E.</given-names></name> <name><surname>Krause</surname> <given-names>J.</given-names></name> <name><surname>Briggs</surname> <given-names>A. W.</given-names></name> <name><surname>Maricic</surname> <given-names>T.</given-names></name> <name><surname>Stenzel</surname> <given-names>U.</given-names></name> <name><surname>Kircher</surname> <given-names>M.</given-names></name><etal/></person-group> (<year>2010</year>). <article-title>A draft sequence of the neandertal genome.</article-title> <source><italic>Science</italic></source> <volume>328</volume> <fpage>710</fpage>&#x2013;<lpage>722</lpage>.</citation></ref>
<ref id="B13"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>G&#x00FC;nther</surname> <given-names>T.</given-names></name> <name><surname>Nettelblad</surname> <given-names>C.</given-names></name></person-group> (<year>2019</year>). <article-title>The presence and impact of reference bias on population genomic studies of prehistoric human populations.</article-title> <source><italic>PLoS Genet.</italic></source> <volume>15</volume>:<issue>e1008302</issue>. <pub-id pub-id-type="doi">10.1371/journal.pgen.1008302</pub-id> <pub-id pub-id-type="pmid">31348818</pub-id></citation></ref>
<ref id="B14"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Haak</surname> <given-names>W.</given-names></name> <name><surname>Lazaridis</surname> <given-names>I.</given-names></name> <name><surname>Patterson</surname> <given-names>N.</given-names></name> <name><surname>Rohland</surname> <given-names>N.</given-names></name> <name><surname>Mallick</surname> <given-names>S.</given-names></name> <name><surname>Llamas</surname> <given-names>B.</given-names></name><etal/></person-group> (<year>2015</year>). <article-title>Massive migration from the steppe was a source for Indo-European languages in Europe.</article-title> <source><italic>Nature</italic></source> <volume>522</volume>:<issue>207</issue>. <pub-id pub-id-type="doi">10.1038/nature14317</pub-id> <pub-id pub-id-type="pmid">25731166</pub-id></citation></ref>
<ref id="B15"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hangh&#x00F8;j</surname> <given-names>K.</given-names></name> <name><surname>Orlando</surname> <given-names>L.</given-names></name></person-group> (<year>2018</year>). &#x201C;<article-title>Ancient epigenomics</article-title>,&#x201D; in <source><italic>Population Genomics</italic></source>, <role>ed.</role> <person-group person-group-type="editor"><name><surname>Rajora</surname> <given-names>O. P.</given-names></name></person-group> (<publisher-loc>Cham</publisher-loc>: <publisher-name>Springer</publisher-name>).</citation></ref>
<ref id="B16"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hangh&#x00F8;j</surname> <given-names>K.</given-names></name> <name><surname>Renaud</surname> <given-names>G.</given-names></name> <name><surname>Albrechtsen</surname> <given-names>A.</given-names></name> <name><surname>Orlando</surname> <given-names>L.</given-names></name></person-group> (<year>2019</year>). <article-title>DamMet: ancient methylome mapping accounting for errors, true variants, and post-mortem DNA damage.</article-title> <source><italic>Gigascience</italic></source> <volume>8</volume>:<issue>giz02</issue>. <pub-id pub-id-type="doi">10.1093/gigascience/giz025</pub-id> <pub-id pub-id-type="pmid">31004132</pub-id></citation></ref>
<ref id="B17"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hangh&#x00F8;j</surname> <given-names>K.</given-names></name> <name><surname>Seguin-Orlando</surname> <given-names>A.</given-names></name> <name><surname>Schubert</surname> <given-names>M.</given-names></name> <name><surname>Madsen</surname> <given-names>T.</given-names></name> <name><surname>Pedersen</surname> <given-names>J. S.</given-names></name> <name><surname>Willerslev</surname> <given-names>E.</given-names></name><etal/></person-group> (<year>2016</year>). <article-title>Fast, accurate and automatic ancient nucleosome and methylation maps with epiPALEOMIX.</article-title> <source><italic>Mol. Biol. Evol.</italic></source> <volume>33</volume> <fpage>3284</fpage>&#x2013;<lpage>3298</lpage>. <pub-id pub-id-type="doi">10.1093/molbev/msw184</pub-id> <pub-id pub-id-type="pmid">27624717</pub-id></citation></ref>
<ref id="B18"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>J&#x00F3;nsson</surname> <given-names>H.</given-names></name> <name><surname>Ginolhac</surname> <given-names>A.</given-names></name> <name><surname>Schubert</surname> <given-names>M.</given-names></name> <name><surname>Johnson</surname> <given-names>P. L. F.</given-names></name> <name><surname>Orlando</surname> <given-names>L.</given-names></name></person-group> (<year>2013</year>). <article-title>MapDamage2.0: fast approximate bayesian estimates of ancient DNA damage parameters.</article-title> <source><italic>Bioinformatics</italic></source> <volume>29</volume> <fpage>1682</fpage>&#x2013;<lpage>1684</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btt193</pub-id> <pub-id pub-id-type="pmid">23613487</pub-id></citation></ref>
<ref id="B19"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Langmead</surname> <given-names>B.</given-names></name> <name><surname>Salzberg</surname> <given-names>S. L.</given-names></name></person-group> (<year>2012</year>). <article-title>Fast gapped-read alignment with Bowtie 2.</article-title> <source><italic>Nat. Methods</italic></source> <volume>9</volume> <fpage>357</fpage>&#x2013;<lpage>359</lpage>. <pub-id pub-id-type="doi">10.1038/nmeth.1923</pub-id> <pub-id pub-id-type="pmid">22388286</pub-id></citation></ref>
<ref id="B20"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Laubach</surname> <given-names>Z. M.</given-names></name> <name><surname>Faulk</surname> <given-names>C. D.</given-names></name> <name><surname>Dolinoy</surname> <given-names>D. C.</given-names></name> <name><surname>Montrose</surname> <given-names>L.</given-names></name> <name><surname>Jones</surname> <given-names>T. R.</given-names></name> <name><surname>Ray</surname> <given-names>D.</given-names></name><etal/></person-group> (<year>2019</year>). <article-title>Early life social and ecological determinants of global DNA methylation in wild spotted hyenas.</article-title> <source><italic>Mol. Ecol.</italic></source> <volume>28</volume> <fpage>3799</fpage>&#x2013;<lpage>3812</lpage>. <pub-id pub-id-type="doi">10.1111/mec.15174</pub-id> <pub-id pub-id-type="pmid">31291495</pub-id></citation></ref>
<ref id="B21"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lea</surname> <given-names>A. J.</given-names></name> <name><surname>Vockley</surname> <given-names>C. M.</given-names></name> <name><surname>Johnston</surname> <given-names>R. A.</given-names></name> <name><surname>Del Carpio</surname> <given-names>C. A.</given-names></name> <name><surname>Barreiro</surname> <given-names>L. B.</given-names></name> <name><surname>Reddy</surname> <given-names>T. E.</given-names></name><etal/></person-group> (<year>2018</year>). <article-title>Genome-wide quantification of the effects of DNA methylation on human gene regulation.</article-title> <source><italic>eLife</italic></source> <volume>7</volume>:<issue>e37513</issue>. <pub-id pub-id-type="doi">10.7554/eLife.37513</pub-id> <pub-id pub-id-type="pmid">30575519</pub-id></citation></ref>
<ref id="B22"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>H.</given-names></name> <name><surname>Durbin</surname> <given-names>R.</given-names></name></person-group> (<year>2009</year>). <article-title>Fast and accurate short read alignment with Burrows-Wheeler transform.</article-title> <source><italic>Bioinformatics</italic></source> <volume>25</volume> <fpage>1754</fpage>&#x2013;<lpage>1760</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btp324</pub-id> <pub-id pub-id-type="pmid">19451168</pub-id></citation></ref>
<ref id="B23"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>H.</given-names></name> <name><surname>Handsaker</surname> <given-names>B.</given-names></name> <name><surname>Wysoker</surname> <given-names>A.</given-names></name> <name><surname>Fennell</surname> <given-names>T.</given-names></name> <name><surname>Ruan</surname> <given-names>J.</given-names></name> <name><surname>Homer</surname> <given-names>N.</given-names></name><etal/></person-group> (<year>2009</year>). <article-title>The sequence alignment/Map format and SAMtools.</article-title> <source><italic>Bioinformatics</italic></source> <volume>25</volume> <fpage>2078</fpage>&#x2013;<lpage>2079</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btp352</pub-id> <pub-id pub-id-type="pmid">19505943</pub-id></citation></ref>
<ref id="B24"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lindenbaum</surname> <given-names>P.</given-names></name></person-group> (<year>2015</year>). <source><italic>JVarkit: Java-Based Utilities for Bioinformatics.</italic></source> Available online at: <ext-link ext-link-type="uri" xlink:href="https://github.com/lindenb/jvarkit">https://github.com/lindenb/jvarkit</ext-link> <comment>(accessed July 22, 2019)</comment>.</citation></ref>
<ref id="B25"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Livyatan</surname> <given-names>I.</given-names></name> <name><surname>Aaronson</surname> <given-names>Y.</given-names></name> <name><surname>Gokhman</surname> <given-names>D.</given-names></name> <name><surname>Ashkenazi</surname> <given-names>R.</given-names></name> <name><surname>Meshorer</surname> <given-names>E.</given-names></name></person-group> (<year>2015</year>). <article-title>BindDB: an integrated database and webtool platform for &#x201C;reverse-ChIP&#x201D; epigenomic analysis.</article-title> <source><italic>Cell Stem Cell</italic></source> <volume>17</volume> <fpage>647</fpage>&#x2013;<lpage>648</lpage>. <pub-id pub-id-type="doi">10.1016/j.stem.2015.11.015</pub-id> <pub-id pub-id-type="pmid">26637941</pub-id></citation></ref>
<ref id="B26"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Llamas</surname> <given-names>B.</given-names></name> <name><surname>Holland</surname> <given-names>M. L.</given-names></name> <name><surname>Chen</surname> <given-names>K.</given-names></name> <name><surname>Cropley</surname> <given-names>J. E.</given-names></name> <name><surname>Cooper</surname> <given-names>A.</given-names></name> <name><surname>Suter</surname> <given-names>C. M.</given-names></name></person-group> (<year>2012</year>). <article-title>High-resolution analysis of cytosine Methylation in ancient DNA.</article-title> <source><italic>PLoS One</italic></source> <volume>7</volume>:<issue>e30226</issue>. <pub-id pub-id-type="doi">10.1371/journal.pone.0030226</pub-id> <pub-id pub-id-type="pmid">22276161</pub-id></citation></ref>
<ref id="B27"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Marciniak</surname> <given-names>S.</given-names></name> <name><surname>Perry</surname> <given-names>G. H.</given-names></name></person-group> (<year>2017</year>). <article-title>Harnessing ancient genomes to study the history of human adaptation.</article-title> <source><italic>Nat. Rev. Genet.</italic></source> <volume>18</volume>:<issue>659</issue>. <pub-id pub-id-type="doi">10.1038/nrg.2017.65</pub-id> <pub-id pub-id-type="pmid">28890534</pub-id></citation></ref>
<ref id="B28"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Martiniano</surname> <given-names>R.</given-names></name> <name><surname>Garrison</surname> <given-names>E.</given-names></name> <name><surname>Jones</surname> <given-names>E. R.</given-names></name> <name><surname>Manica</surname> <given-names>A.</given-names></name> <name><surname>Durbin</surname> <given-names>R.</given-names></name></person-group> (<year>2019</year>). <article-title>Removing reference bias in ancient DNA data analysis by mapping to a sequence variation graph.</article-title> <source><italic>BioRxiv</italic></source> [Preprint], <pub-id pub-id-type="doi">10.1101/782755</pub-id></citation></ref>
<ref id="B29"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mathieson</surname> <given-names>I.</given-names></name> <name><surname>Lazaridis</surname> <given-names>I.</given-names></name> <name><surname>Rohland</surname> <given-names>N.</given-names></name> <name><surname>Mallick</surname> <given-names>S.</given-names></name> <name><surname>Patterson</surname> <given-names>N.</given-names></name> <name><surname>Roodenberg</surname> <given-names>S. A.</given-names></name><etal/></person-group> (<year>2015</year>). <article-title>Genome-wide patterns of selection in 230 ancient Eurasians.</article-title> <source><italic>Nature</italic></source> <volume>528</volume>:<issue>499</issue>. <pub-id pub-id-type="doi">10.1038/nature16152</pub-id> <pub-id pub-id-type="pmid">26595274</pub-id></citation></ref>
<ref id="B30"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Narasimhan</surname> <given-names>V. M.</given-names></name> <name><surname>Patterson</surname> <given-names>N.</given-names></name> <name><surname>Moorjani</surname> <given-names>P.</given-names></name> <name><surname>Rohland</surname> <given-names>N.</given-names></name> <name><surname>Bernardos</surname> <given-names>R.</given-names></name> <name><surname>Mallick</surname> <given-names>S.</given-names></name><etal/></person-group> (<year>2019</year>). <article-title>The formation of human populations in South and Central Asia.</article-title> <source><italic>Science</italic></source> <volume>365</volume>:<issue>eaat7487</issue>. <pub-id pub-id-type="doi">10.1126/science.aat7487</pub-id> <pub-id pub-id-type="pmid">31488661</pub-id></citation></ref>
<ref id="B31"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pacis</surname> <given-names>A.</given-names></name> <name><surname>Tailleux</surname> <given-names>L.</given-names></name> <name><surname>Morin</surname> <given-names>A. M.</given-names></name> <name><surname>Lambourne</surname> <given-names>J.</given-names></name> <name><surname>MacIsaac</surname> <given-names>J. L.</given-names></name> <name><surname>Yotova</surname> <given-names>V.</given-names></name><etal/></person-group> (<year>2015</year>). <article-title>Bacterial infection remodels the DNA methylation landscape of human dendritic cells.</article-title> <source><italic>Genome Res.</italic></source> <volume>25</volume> <fpage>1801</fpage>&#x2013;<lpage>1811</lpage>. <pub-id pub-id-type="doi">10.1101/gr.192005.115</pub-id> <pub-id pub-id-type="pmid">26392366</pub-id></citation></ref>
<ref id="B32"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pedersen</surname> <given-names>J. S.</given-names></name> <name><surname>Valen</surname> <given-names>E.</given-names></name> <name><surname>Velazquez</surname> <given-names>A. M. V.</given-names></name> <name><surname>Parker</surname> <given-names>B. J.</given-names></name> <name><surname>Rasmussen</surname> <given-names>M.</given-names></name> <name><surname>Lindgreen</surname> <given-names>S.</given-names></name><etal/></person-group> (<year>2014</year>). <article-title>Genome-wide nucleosome map and cytosine methylation levels of an ancient human genome.</article-title> <source><italic>Genome Res.</italic></source> <volume>24</volume> <fpage>454</fpage>&#x2013;<lpage>466</lpage>. <pub-id pub-id-type="doi">10.1101/gr.163592.113</pub-id> <pub-id pub-id-type="pmid">24299735</pub-id></citation></ref>
<ref id="B33"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Quinlan</surname> <given-names>A. R.</given-names></name> <name><surname>Hall</surname> <given-names>I. M.</given-names></name></person-group> (<year>2010</year>). <article-title>BEDTools: a flexible suite of utilities for comparing genomic features.</article-title> <source><italic>Bioinformatics</italic></source> <volume>26</volume> <fpage>841</fpage>&#x2013;<lpage>842</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btq033</pub-id> <pub-id pub-id-type="pmid">20110278</pub-id></citation></ref>
<ref id="B34"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rasmussen</surname> <given-names>M.</given-names></name> <name><surname>Li</surname> <given-names>Y.</given-names></name> <name><surname>Lindgreen</surname> <given-names>S.</given-names></name> <name><surname>Pedersen</surname> <given-names>J. S.</given-names></name> <name><surname>Albrechtsen</surname> <given-names>A.</given-names></name> <name><surname>Moltke</surname> <given-names>I.</given-names></name><etal/></person-group> (<year>2010</year>). <article-title>Ancient human genome sequence of an extinct Palaeo-Eskimo.</article-title> <source><italic>Nature</italic></source> <volume>463</volume> <fpage>757</fpage>&#x2013;<lpage>762</lpage>. <pub-id pub-id-type="doi">10.1038/nature08835</pub-id> <pub-id pub-id-type="pmid">20148029</pub-id></citation></ref>
<ref id="B35"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Reich</surname> <given-names>D.</given-names></name> <name><surname>Green</surname> <given-names>R. E.</given-names></name> <name><surname>Kircher</surname> <given-names>M.</given-names></name> <name><surname>Krause</surname> <given-names>J.</given-names></name> <name><surname>Patterson</surname> <given-names>N.</given-names></name> <name><surname>Durand</surname> <given-names>E. Y.</given-names></name><etal/></person-group> (<year>2010</year>). <article-title>Genetic history of an archaic hominin group from denisova cave in Siberia.</article-title> <source><italic>Nature</italic></source> <volume>468</volume> <fpage>1053</fpage>&#x2013;<lpage>1060</lpage>. <pub-id pub-id-type="doi">10.1038/nature09710</pub-id> <pub-id pub-id-type="pmid">21179161</pub-id></citation></ref>
<ref id="B36"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Renaud</surname> <given-names>G.</given-names></name> <name><surname>Hangh&#x00F8;j</surname> <given-names>K.</given-names></name> <name><surname>Willerslev</surname> <given-names>E.</given-names></name> <name><surname>Orlando</surname> <given-names>L.</given-names></name></person-group> (<year>2017</year>). <article-title>Gargammel: a sequence simulator for ancient DNA.</article-title> <source><italic>Bioinformatics</italic></source> <volume>33</volume> <fpage>577</fpage>&#x2013;<lpage>579</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btw670</pub-id> <pub-id pub-id-type="pmid">27794556</pub-id></citation></ref>
<ref id="B37"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Renaud</surname> <given-names>G.</given-names></name> <name><surname>Petersen</surname> <given-names>B.</given-names></name> <name><surname>Seguin-Orlando</surname> <given-names>A.</given-names></name> <name><surname>Bertelsen</surname> <given-names>M. F.</given-names></name> <name><surname>Waller</surname> <given-names>A.</given-names></name> <name><surname>Newton</surname> <given-names>R.</given-names></name><etal/></person-group> (<year>2018</year>). <article-title>Improved de novo genomic assembly for the domestic donkey.</article-title> <source><italic>Sci. Adv.</italic></source> <volume>4</volume>:<issue>eaaq0392</issue>. <pub-id pub-id-type="doi">10.1126/sciadv.aaq0392</pub-id> <pub-id pub-id-type="pmid">29740610</pub-id></citation></ref>
<ref id="B38"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rohland</surname> <given-names>N.</given-names></name> <name><surname>Harney</surname> <given-names>E.</given-names></name> <name><surname>Mallick</surname> <given-names>S.</given-names></name> <name><surname>Nordenfelt</surname> <given-names>S.</given-names></name> <name><surname>Reich</surname> <given-names>D.</given-names></name></person-group> (<year>2015</year>). <article-title>Partial uracil &#x2013; DNA &#x2013; glycosylase treatment for screening of ancient DNA.</article-title> <source><italic>Philos. Trans. R. Soc. B Biol. Sci.</italic></source> <volume>370</volume>:<issue>20130624</issue>. <pub-id pub-id-type="doi">10.1098/rstb.2013.0624</pub-id> <pub-id pub-id-type="pmid">25487342</pub-id></citation></ref>
<ref id="B39"><citation citation-type="journal"><collab>RStudio Team</collab> (<year>2016</year>). <source><italic>RStudio: Integrated Development for R.</italic></source> <publisher-loc>Boston, MA</publisher-loc>: <publisher-name>RStudio, Inc</publisher-name>.</citation></ref>
<ref id="B40"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Santos</surname> <given-names>H. P.</given-names></name> <name><surname>Bhattacharya</surname> <given-names>A.</given-names></name> <name><surname>Martin</surname> <given-names>E. M.</given-names></name> <name><surname>Addo</surname> <given-names>K.</given-names></name> <name><surname>Psioda</surname> <given-names>M.</given-names></name> <name><surname>Smeester</surname> <given-names>L.</given-names></name><etal/></person-group> (<year>2019</year>). <article-title>Epigenome-wide DNA methylation in placentas from preterm infants: association with maternal socioeconomic status.</article-title> <source><italic>Epigenetics</italic></source> <volume>14</volume> <fpage>751</fpage>&#x2013;<lpage>765</lpage>. <pub-id pub-id-type="doi">10.1080/15592294.2019.1614743</pub-id> <pub-id pub-id-type="pmid">31062658</pub-id></citation></ref>
<ref id="B41"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sanz</surname> <given-names>J.</given-names></name> <name><surname>Maurizio</surname> <given-names>P. L.</given-names></name> <name><surname>Snyder-Mackler</surname> <given-names>N.</given-names></name> <name><surname>Simons</surname> <given-names>N. D.</given-names></name> <name><surname>Voyles</surname> <given-names>T.</given-names></name> <name><surname>Kohn</surname> <given-names>J.</given-names></name><etal/></person-group> (<year>2019</year>). <article-title>Social history and exposure to pathogen signals modulate social status effects on gene regulation in rhesus macaques.</article-title> <source><italic>Proc. Natl. Acad. Sci. U.S.A.</italic></source> 201820846. <pub-id pub-id-type="doi">10.1073/pnas.1820846116</pub-id> <pub-id pub-id-type="pmid">31611381</pub-id></citation></ref>
<ref id="B42"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schubert</surname> <given-names>M.</given-names></name> <name><surname>Ermini</surname> <given-names>L.</given-names></name> <name><surname>Sarkissian</surname> <given-names>C. Der</given-names></name> <name><surname>J&#x00F3;nsson</surname> <given-names>H.</given-names></name> <name><surname>Ginolhac</surname> <given-names>A.</given-names></name> <name><surname>Schaefer</surname> <given-names>R.</given-names></name><etal/></person-group> (<year>2014</year>). <article-title>Characterization of ancient and modern genomes by SNP detection and phylogenomic and metagenomic analysis using PALEOMIX.</article-title> <source><italic>Nat. Protoc.</italic></source> <volume>9</volume>:<issue>1056</issue>. <pub-id pub-id-type="doi">10.1038/nprot.2014.063</pub-id> <pub-id pub-id-type="pmid">24722405</pub-id></citation></ref>
<ref id="B43"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schubert</surname> <given-names>M.</given-names></name> <name><surname>Ginolhac</surname> <given-names>A.</given-names></name> <name><surname>Lindgreen</surname> <given-names>S.</given-names></name> <name><surname>Thompson</surname> <given-names>J. F.</given-names></name> <name><surname>Al-Rasheid</surname> <given-names>K. A. S.</given-names></name> <name><surname>Willerslev</surname> <given-names>E.</given-names></name><etal/></person-group> (<year>2012</year>). <article-title>Improving ancient DNA read mapping against modern reference genomes.</article-title> <source><italic>BMC Genomics</italic></source> <volume>13</volume>:<issue>178</issue>. <pub-id pub-id-type="doi">10.1186/1471-2164-13-178</pub-id> <pub-id pub-id-type="pmid">22574660</pub-id></citation></ref>
<ref id="B44"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schubert</surname> <given-names>M.</given-names></name> <name><surname>Lindgreen</surname> <given-names>S.</given-names></name> <name><surname>Orlando</surname> <given-names>L.</given-names></name></person-group> (<year>2016</year>). <article-title>AdapterRemoval v2: rapid adapter trimming, identification, and read merging.</article-title> <source><italic>BMC Res. Notes</italic></source> <volume>9</volume>:<issue>88</issue>. <pub-id pub-id-type="doi">10.1186/s13104-016-1900-2</pub-id> <pub-id pub-id-type="pmid">26868221</pub-id></citation></ref>
<ref id="B45"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Seguin-Orlando</surname> <given-names>A.</given-names></name> <name><surname>Gamba</surname> <given-names>C.</given-names></name> <name><surname>Sarkissian</surname> <given-names>C. Der</given-names></name> <name><surname>Ermini</surname> <given-names>L.</given-names></name> <name><surname>Louvel</surname> <given-names>G.</given-names></name> <name><surname>Boulygina</surname> <given-names>E.</given-names></name><etal/></person-group> (<year>2015</year>). <article-title>Pros and cons of methylation-based enrichment methods for ancient DNA.</article-title> <source><italic>Sci. Rep.</italic></source> <volume>5</volume>:<issue>11826</issue>. <pub-id pub-id-type="doi">10.1038/srep11826</pub-id> <pub-id pub-id-type="pmid">26134828</pub-id></citation></ref>
<ref id="B46"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>S&#x00E9;gurel</surname> <given-names>L.</given-names></name> <name><surname>Bon</surname> <given-names>C.</given-names></name></person-group> (<year>2017</year>). <article-title>On the evolution of lactase persistence in humans.</article-title> <source><italic>Annu. Rev. Genomics Hum. Genet.</italic></source> <volume>18</volume> <fpage>297</fpage>&#x2013;<lpage>319</lpage>.</citation></ref>
<ref id="B47"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shen</surname> <given-names>W.</given-names></name> <name><surname>Le</surname> <given-names>S.</given-names></name> <name><surname>Li</surname> <given-names>Y.</given-names></name> <name><surname>Hu</surname> <given-names>F.</given-names></name></person-group> (<year>2016</year>). <article-title>SeqKit: a cross-platform and ultrafast toolkit for FASTA/Q file manipulation.</article-title> <source><italic>PLoS One</italic></source> <volume>11</volume>:<issue>e0163962</issue>. <pub-id pub-id-type="doi">10.1371/journal.pone.0163962</pub-id> <pub-id pub-id-type="pmid">27706213</pub-id></citation></ref>
<ref id="B48"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sikora</surname> <given-names>M.</given-names></name> <name><surname>Seguin-Orlando</surname> <given-names>A.</given-names></name> <name><surname>Sousa</surname> <given-names>V. C.</given-names></name> <name><surname>Albrechtsen</surname> <given-names>A.</given-names></name> <name><surname>Korneliussen</surname> <given-names>T.</given-names></name> <name><surname>Ko</surname> <given-names>A.</given-names></name><etal/></person-group> (<year>2017</year>). <article-title>Ancient genomes show social and reproductive behavior of early upper paleolithic foragers.</article-title> <source><italic>Science</italic></source> <volume>358</volume> <fpage>659</fpage>&#x2013;<lpage>662</lpage>. <pub-id pub-id-type="doi">10.3389/fpsyg.2017.02247</pub-id> <pub-id pub-id-type="pmid">29313851</pub-id></citation></ref>
<ref id="B49"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Smith</surname> <given-names>O.</given-names></name> <name><surname>Clapham</surname> <given-names>A. J.</given-names></name> <name><surname>Rose</surname> <given-names>P.</given-names></name> <name><surname>Liu</surname> <given-names>Y.</given-names></name> <name><surname>Wang</surname> <given-names>J.</given-names></name> <name><surname>Allaby</surname> <given-names>R. G.</given-names></name></person-group> (<year>2014</year>). <article-title>Genomic methylation patterns in archaeological barley show de-methylation as a time-dependent diagenetic process.</article-title> <source><italic>Sci. Rep.</italic></source> <volume>4</volume>:<issue>5559</issue>. <pub-id pub-id-type="doi">10.1038/srep05559</pub-id> <pub-id pub-id-type="pmid">24993353</pub-id></citation></ref>
<ref id="B50"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Smith</surname> <given-names>R. W. A.</given-names></name> <name><surname>Monroe</surname> <given-names>C.</given-names></name> <name><surname>Bolnick</surname> <given-names>D. A.</given-names></name></person-group> (<year>2015</year>). <article-title>Detection of cytosine methylation in ancient DNA from five native american populations using bisulfite sequencing.</article-title> <source><italic>PLoS One</italic></source> <volume>10</volume>:<issue>e0125344</issue>. <pub-id pub-id-type="doi">10.1371/journal.pone.0125344</pub-id> <pub-id pub-id-type="pmid">26016479</pub-id></citation></ref>
<ref id="B51"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Snyder-Mackler</surname> <given-names>N.</given-names></name> <name><surname>Sanz</surname> <given-names>J.</given-names></name> <name><surname>Kohn</surname> <given-names>J. N.</given-names></name> <name><surname>Voyles</surname> <given-names>T.</given-names></name> <name><surname>Pique-Regi</surname> <given-names>R.</given-names></name> <name><surname>Wilson</surname> <given-names>M. E.</given-names></name><etal/></person-group> (<year>2019</year>). <article-title>Social status alters chromatin accessibility and the gene regulatory response to glucocorticoid stimulation in rhesus macaques.</article-title> <source><italic>Proc. Natl. Acad. Sci. U.S.A.</italic></source> <volume>116</volume> <fpage>1219</fpage>&#x2013;<lpage>1228</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.1811758115</pub-id> <pub-id pub-id-type="pmid">30538209</pub-id></citation></ref>
<ref id="B52"><citation citation-type="journal"><collab>The Genome Sequencing Consortium</collab> (<year>2001</year>). <article-title>Initial sequencing and analysis of the human genome.</article-title> <source><italic>Nature</italic></source> <volume>409</volume> <fpage>860</fpage>&#x2013;<lpage>921</lpage>. <pub-id pub-id-type="doi">10.1038/35057062</pub-id> <pub-id pub-id-type="pmid">11237011</pub-id></citation></ref>
<ref id="B53"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>C.-C.</given-names></name> <name><surname>Reinhold</surname> <given-names>S.</given-names></name> <name><surname>Kalmykov</surname> <given-names>A.</given-names></name> <name><surname>Wissgott</surname> <given-names>A.</given-names></name> <name><surname>Brandt</surname> <given-names>G.</given-names></name> <name><surname>Jeong</surname> <given-names>C.</given-names></name><etal/></person-group> (<year>2019</year>). <article-title>Ancient human genome-wide data from a 3000-year interval in the caucasus corresponds with eco-geographic regions.</article-title> <source><italic>Nat. Commun.</italic></source> <volume>10</volume>:<issue>590</issue>. <pub-id pub-id-type="doi">10.1038/s41467-018-08220-8</pub-id> <pub-id pub-id-type="pmid">30713341</pub-id></citation></ref>
<ref id="B54"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wickham</surname> <given-names>H.</given-names></name></person-group> (<year>2016</year>). <source><italic>Ggplot2: Elegant Graphics for Data Analysis.</italic></source> <publisher-loc>New York, NY</publisher-loc>: <publisher-name>Springer-Verlag</publisher-name>.</citation></ref>
</ref-list>
</back>
</article>
