<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Plant Sci.</journal-id>
<journal-title>Frontiers in Plant Science</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Plant Sci.</abbrev-journal-title>
<issn pub-type="epub">1664-462X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fpls.2016.00777</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Plant Science</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Genotyping-by-Sequencing SNP Identification for Crops without a Reference Genome: Using Transcriptome Based Mapping as an Alternative Strategy</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name><surname>Berthouly-Salazar</surname> <given-names>C&#x00E9;cile</given-names></name>
<xref ref-type="author-notes" rid="fn001"><sup>&#x002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/326590/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Mariac</surname> <given-names>C&#x00E9;dric</given-names></name>
<uri xlink:href="http://loop.frontiersin.org/people/339051/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Couderc</surname> <given-names>Marie</given-names></name>
</contrib>
<contrib contrib-type="author">
<name><surname>Pouzadoux</surname> <given-names>Juliette</given-names></name>
</contrib>
<contrib contrib-type="author">
<name><surname>Floc&#x2019;h</surname> <given-names>Jean-Baptiste</given-names></name>
<uri xlink:href="http://loop.frontiersin.org/people/349376/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Vigouroux</surname> <given-names>Yves</given-names></name>
<uri xlink:href="http://loop.frontiersin.org/people/324655/overview"/>
</contrib>
</contrib-group>
<aff id="aff1"><institution>UMR Diversit&#x00E9;, Adaptation et D&#x00E9;veloppement des Plantes, Institut de Recherche pour le D&#x00E9;veloppement</institution> <country>Montpellier, France</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: <italic>Naiara Rodriguez-Ezpeleta, AZTI-Tecnalia, Spain</italic></p></fn>
<fn fn-type="edited-by"><p>Reviewed by: <italic>Baocheng Guo, University of Helsinki, Finland; Alicia Mastretta-Yanes, CONACYT &#x2013; CONABIO, Mexico</italic></p></fn>
<fn fn-type="corresp" id="fn001"><p>&#x002A;Correspondence: <italic>C&#x00E9;cile Berthouly-Salazar, <email>cecile.berthouly@ird.fr</email></italic></p></fn>
<fn fn-type="other" id="fn002"><p>This article was submitted to Evolutionary and Population Genetics, a section of the journal Frontiers in Plant Science</p></fn>
</author-notes>
<pub-date pub-type="epub">
<day>15</day>
<month>06</month>
<year>2016</year>
</pub-date>
<pub-date pub-type="collection">
<year>2016</year>
</pub-date>
<volume>7</volume>
<elocation-id>777</elocation-id>
<history>
<date date-type="received">
<day>29</day>
<month>02</month>
<year>2016</year>
</date>
<date date-type="accepted">
<day>19</day>
<month>05</month>
<year>2016</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x00A9; 2016 Berthouly-Salazar, Mariac, Couderc, Pouzadoux, Floc&#x2019;h and Vigouroux.</copyright-statement>
<copyright-year>2016</copyright-year>
<copyright-holder>Berthouly-Salazar, Mariac, Couderc, Pouzadoux, Floc&#x2019;h and Vigouroux</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license>
</permissions>
<abstract>
<p>Next-generation sequencing opens the way for genomic studies of diversity even for non-model crops and animals. Genome reduction techniques are becoming progressively more popular as they allow a fraction of the genome to be sequenced for multiple individuals and/or populations. These techniques are an efficient way to explore genome diversity in non-model crops and animals for which no reference genome is available. Genome reduction techniques emerged with the development of specific pipelines such as UNEAK (Universal Network Enabled Analysis Kit) and Stacks. However, even for non-model crops and animals, transcriptomes are easier to obtain, thereby making it possible to directly map reads. We investigate the direct use of transcriptome as an alternative strategy. Our specific objective was to compare SNPs obtained from the UNEAK pipeline as well as SNPs obtained by directly mapping genotyping-by-sequencing reads on a transcriptome. We assessed the feasibility of both SNP datasets, UNEAK and transcriptome mapping, to investigate the diversity of 91 samples of wild pearl millet sampled across its distribution area. Both approaches produced several tens of thousands of single nucleotide variants, but differed in the way the variants were identified, leading to differences in the frequency spectrum associated with marked differences in the assessment of diversity. Difference in the frequency spectrum significantly biased a large set of diversity analyses as well as detection of selection approaches. However, whatever the approach, we found very similar inference of genetic structure, with three major genetic groups from West, Central, and East Africa. For non-model crops, using transcriptome data as a reference is thus a particularly promising way to obtain a more thorough analysis of datasets generated using genome reduction techniques.</p>
</abstract>
<kwd-group>
<kwd>SNP</kwd>
<kwd>GBS</kwd>
<kwd>UNEAK</kwd>
<kwd>transcriptome</kwd>
<kwd>site frequency spectrum</kwd>
<kwd>pearl millet</kwd>
</kwd-group>
<contract-num rid="cn001">ANR-12-PDOC-009-01, ANR-13-BSV7-0017-01</contract-num>
<contract-sponsor id="cn001">Agence Nationale de la Recherche<named-content content-type="fundref-id">10.13039/501100001665</named-content></contract-sponsor>
<counts>
<fig-count count="4"/>
<table-count count="1"/>
<equation-count count="0"/>
<ref-count count="71"/>
<page-count count="11"/>
<word-count count="0"/>
</counts>
</article-meta>
</front>
<body>
<sec><title>Introduction</title>
<p>In the last two decades, next-generation sequencing (NGS) technologies (<xref ref-type="bibr" rid="B44">Mardis</xref>, <xref ref-type="bibr" rid="B44">2008</xref>) have made the assembly of numerous new reference genomes possible (<xref ref-type="bibr" rid="B16">Ellegren, 2014</xref>). Yet, in the case of non-model organisms, accessing genome diversity remains a challenge. Sequencing only a fraction of a large genome has been proposed as a promising way of getting round this constraint (<xref ref-type="bibr" rid="B50">Narum et al., 2013</xref>). Reduced-representation library (RRL) sequencing approaches enable sequencing of a fraction of the genome as well as of homologous regions in a set of individuals. Among RRL techniques, two main approaches are widely used today: the RAD-seq approach (<xref ref-type="bibr" rid="B4">Baird et al., 2008</xref>; <xref ref-type="bibr" rid="B13">Davey et al., 2011</xref>) and the genotyping-by-sequencing (GBS) approach (<xref ref-type="bibr" rid="B17">Elshire et al., 2011</xref>) but several others are also available (e.g., PE-RAD, dd-RAD, 2b-RAD, ezRAD). GBS, like RAD-seq, reduces genome complexity through restriction digest, but offers a simplified and more cost-effective library preparation protocol (<xref ref-type="bibr" rid="B17">Elshire et al., 2011</xref>). These molecular techniques were developed at the same time as specific bioinformatics pipelines to handle the resulting NGS raw sequences. For instance, the Stacks pipeline was developed primarily for RAD-seq data (<xref ref-type="bibr" rid="B9">Catchen et al., 2011</xref>, <xref ref-type="bibr" rid="B8">2013</xref>), while the TASSEL pipeline was developed for the GBS approach (<xref ref-type="bibr" rid="B26">Glaubitz et al., 2014</xref>).</p>
<p>Therefore, even though RAD-seq and GBS data can be analyzed using either pipeline, they are preferentially analyzed using their original corresponding pipeline. There is also a preference for each RRL approach that depends on the &#x201C;scientific community&#x201D; concerned. For instance, RAD-seq is widely used for evolutionary history and conservation studies on wild organisms (<xref ref-type="bibr" rid="B37">Hohenlohe et al., 2013</xref>; <xref ref-type="bibr" rid="B59">Pujolar et al., 2014</xref>; <xref ref-type="bibr" rid="B10">Combosch and Vollmer, 2015</xref>), whereas GBS is used by researchers working on crops and domesticated animals. The TASSEL pipeline was thus primarily developed to handle low coverage sequencing for homozygote samples (<xref ref-type="bibr" rid="B26">Glaubitz et al., 2014</xref>) and to be used in genome wide association studies (<xref ref-type="bibr" rid="B49">Moumouni et al., 2015</xref>; <xref ref-type="bibr" rid="B66">Sonah et al., 2015</xref>; <xref ref-type="bibr" rid="B69">Upadhyaya et al., 2015</xref>). Even among crops, not all species are model organisms with a reference genome. When no reference is available, somewhat similar strategies are implemented in Stacks and TASSEL to identify SNPs. First, similar reads are identified and grouped together to create TAGs. Second, networks of TAGs are built to identify which TAGs could be considered as alternative copies of the same genomic loci. These steps depend on several parameters, such as minimum coverage, for a read to be considered as a TAG, or the number of mismatches between two TAGs to be considered as alternative copies of one locus or different loci. The TASSEL &#x201C;no reference genome&#x201D; pipeline is implemented in the UNEAK (Universal Network Enabled Analysis Kit) module (<xref ref-type="bibr" rid="B43">Lu et al., 2013</xref>). SNPs are identified by drawing simple networks of reciprocal TAGs that only differ by 1 bp mismatch. 
Significant effects of pipeline parameters on SNPs identified and population genetics inferences have been highlighted for Stacks (<xref ref-type="bibr" rid="B8">Catchen et al., 2013</xref>; <xref ref-type="bibr" rid="B48">Mastretta-Yanes et al., 2014</xref>; <xref ref-type="bibr" rid="B61">Rodr&#x00ED;guez-Ezpeleta et al., 2016</xref>). To our knowledge, the effects of the UNEAK calling approach on population genetics have not yet been investigated.</p>
<p>An alternative strategy would be to map genomic reads from RRL approaches directly on a transcriptome. Most non-model crops possess a transcriptome reference that was primarily built for transcriptome studies. While building a transcriptome was formerly challenging (<xref ref-type="bibr" rid="B46">Martin and Wang, 2011</xref>; <xref ref-type="bibr" rid="B27">G&#x00F3;ngora-Castillo and Buell, 2013</xref>), new tools are available today that make it possible to rapidly and efficiently obtain a new assembly (<xref ref-type="bibr" rid="B28">Grabherr et al., 2011</xref>). Transcriptomes enable access to longer sequences around SNPs, a very interesting feature for further SNP validation and access to an annotation of the genomic region. Thus, using a transcriptome reference to map reads from RRL approaches (<xref ref-type="bibr" rid="B62">Russell et al., 2013</xref>; <xref ref-type="bibr" rid="B10">Combosch and Vollmer, 2015</xref>) could be an interesting alternative for SNP discovery.</p>
<p>However, it is not easy to assess the bias arising from using the SNP calling pipeline, especially for population genetic studies (<xref ref-type="bibr" rid="B35">Hohenlohe et al., 2010</xref>; <xref ref-type="bibr" rid="B52">Nielsen et al., 2012</xref>; <xref ref-type="bibr" rid="B3">Arnold et al., 2013</xref>; <xref ref-type="bibr" rid="B12">Davey et al., 2013</xref>; <xref ref-type="bibr" rid="B25">Gautier et al., 2013</xref>; <xref ref-type="bibr" rid="B31">Han et al., 2014</xref>; <xref ref-type="bibr" rid="B38">Ilut et al., 2014</xref>; <xref ref-type="bibr" rid="B33">Harvey et al., 2015</xref>; <xref ref-type="bibr" rid="B61">Rodr&#x00ED;guez-Ezpeleta et al., 2016</xref>). Therefore, in the following, we compare two sets of SNPs obtained from wild pearl millet populations using GBS sequencing. The first set of SNPs was obtained through the UNEAK pipeline without a reference genome and the second set was obtained through a mapping pipeline to the pearl millet transcriptome. We therefore investigated the differences and congruence in SNPs called for the assessment of population structure and analysis of genetic diversity.</p>
</sec>
<sec id="s1" sec-type="materials|methods">
<title>Materials and Methods</title>
<sec><title>Plant Material</title>
<p>We selected 48 wild pearl millet populations [<italic>Pennisetum glaucum</italic> (L.) R. Br. ssp. <italic>monodii</italic>] from a collection held at IRD (Institut de Recherche pour le D&#x00E9;veloppement, Montpellier, France). The 48 populations were chosen to cover the known distribution of wild pearl millet (<bold>Figure <xref ref-type="fig" rid="F1">1</xref></bold>). Seeds were grown in the greenhouse until flowering, and inflorescences from 10 plants per population were collected for DNA extraction. DNA was extracted using the MATAB protocol (a modified CTAB/&#x03B2;-mercaptoethanol method; <xref ref-type="bibr" rid="B45">Mariac et al., 2006</xref>). A set of 95 DNA samples normalized to 100 ng/&#x03BC;l (sample size per population &#x2264; 2) was sent to the Institute for Genomic Diversity at Cornell University<sup><xref ref-type="fn" rid="fn01">1</xref></sup> for GBS genotyping. Details of the GBS protocol can be found elsewhere (<xref ref-type="bibr" rid="B17">Elshire et al., 2011</xref>; <xref ref-type="bibr" rid="B11">Cronn et al., 2012</xref>). Genomic libraries were constructed using the <italic>ApeKI</italic> restriction enzyme. The resulting 95-plex library was sequenced with an Illumina HiSeq2000. Four samples were not used for subsequent analyses due to their high rate of missing genotypes (&#x003E;70%).</p>
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption><p><bold>Geographical distribution of the 48 populations of wild pearl millet</bold>.</p></caption>
<graphic xlink:href="fpls-07-00777-g001.tif"/>
</fig>
</sec>
<sec><title>SNP Discovery and Genotype Calling</title>
<sec><title>UNEAK Pipeline</title>
<p>Raw sequences were processed with a modification of the TASSEL-GBS pipeline (<xref ref-type="bibr" rid="B26">Glaubitz et al., 2014</xref>): the UNEAK pipeline (<xref ref-type="bibr" rid="B43">Lu et al., 2013</xref>). With the UNEAK pipeline, the alignment of TAGs to a reference genome is replaced by the creation of TAG pairs and network filtering to enable SNP discovery (<xref ref-type="bibr" rid="B43">Lu et al., 2013</xref>). Briefly, good reads were defined as reads carrying a perfect barcode match with no Ns in the 64 bp following the barcode. Reads were subsequently trimmed to 64 bp (excluding barcodes). Unique 64-bp sequence TAGs that were present five or more times across all samples were retained and used to identify &#x201C;TAG pairs,&#x201D; with a default error tolerance rate (ETR) of 0.03, as described in <xref ref-type="bibr" rid="B43">Lu et al. (2013)</xref>. Reciprocal &#x201C;TAG pairs&#x201D; with only 1 bp mismatch were considered as putative SNPs. Likelihood scores for each possible genotype were calculated according to formula 3.8 of <xref ref-type="bibr" rid="B18">Etter et al. (2011)</xref> and the most likely genotype was assigned. SNPs with a minor allele frequency (MAF) below 0.05 were excluded. Analyses were conducted with TASSEL version 3.0.157. The resulting set of 262,928 SNPs was then filtered for depth of coverage (DP) and for the percentage of missing data per SNP (&#x003C;10%). We used the median coverage across all SNPs as the threshold for the DP filter.</p>
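<p>The tag-pairing idea can be illustrated with a minimal Python sketch. It is a simplified toy version and not the UNEAK implementation: it keeps TAGs seen at least five times and reports pairs of equal-length TAGs that differ at exactly one position and are each other's only such neighbor. The error tolerance rate filter is omitted, and the function name and inputs are hypothetical.</p>
<preformat><![CDATA[
```python
from collections import Counter, defaultdict


def find_reciprocal_tag_pairs(reads, min_count=5):
    """Return (tag_a, tag_b, position) for tags that are each other's only 1-bp neighbour.

    Toy illustration only: no error tolerance rate (ETR) filtering is applied.
    """
    counts = Counter(reads)
    tags = [t for t, c in counts.items() if c >= min_count]

    # Index every tag under each of its "masked" forms (one position replaced by '*').
    # Two tags sharing a masked form differ at that position only.
    buckets = defaultdict(list)
    for tag in tags:
        for i in range(len(tag)):
            buckets[(i, tag[:i] + "*" + tag[i + 1:])].append(tag)

    # For every tag, collect the tags exactly one mismatch away.
    neighbours = defaultdict(set)
    site = {}
    for (i, _), group in buckets.items():
        for a in group:
            for b in group:
                if a != b:
                    neighbours[a].add(b)
                    site[frozenset((a, b))] = i

    # Keep only reciprocal pairs: each tag has exactly one neighbour, the other tag.
    pairs = []
    for a in sorted(neighbours):
        if len(neighbours[a]) != 1:
            continue
        (b,) = neighbours[a]
        if a < b and len(neighbours[b]) == 1:
            pairs.append((a, b, site[frozenset((a, b))]))
    return pairs
```
]]></preformat>
<p>In this sketch, a TAG with two or more one-mismatch neighbors is discarded together with its neighbors, which mimics the restriction to simple two-TAG networks described above.</p>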
</sec>
<sec><title>Transcriptome Based Mapping (TM) Pipeline</title>
<p>The wild pearl millet transcriptome contains 50,313 contigs for a total of 36.5 Mb. This transcriptome was built from RNA from early inflorescences when differential expression was not too pronounced. The average contig length is 725 bp &#x00B1; 732 bp (the transcriptome assembly<sup><xref ref-type="fn" rid="fn02">2</xref></sup>).</p>
<p>Raw sequences were first trimmed for low quality ends (&#x003C;20) and reads shorter than 35 bp were removed using Cutadapt 1.2.1 (<xref ref-type="bibr" rid="B47">Martin, 2011</xref>). Second, a filter on read mean quality was applied at a threshold of 30. Reads were mapped to the assembly with BWA version 0.7.5 (<xref ref-type="bibr" rid="B41">Li and Durbin, 2009</xref>) with -n 3, allowing for a maximum number of three mismatches. Unmapped reads were removed using SAMtools version 0.1.17 (<xref ref-type="bibr" rid="B42">Li et al., 2009</xref>). We used RealignerTargetCreator and IndelRealigner from GATK version 2.4.7 (<xref ref-type="bibr" rid="B14">DePristo et al., 2011</xref>) to handle indels. SNPs and genotypes were called using UnifiedGenotyper. A total of 236,897 SNPs were then filtered to allow no more than three mismatches per 10 bp window; a HARD_TO_VALIDATE mapping quality (MQ) filter was applied [MQ0 &#x2265; 4 &#x0026;&#x0026; ((MQ0/(1.0 <sup>&#x2217;</sup> DP)) &#x003E; 0.1)]; and filtering was performed on the QUAL (Quality) and QD (Quality by Depth) parameters, which are derived from Illumina quality scores (QUAL &#x2264; 60; QD &#x2264; 6.87, the 5% quantile). The 121,279 remaining SNPs were then filtered for DP using the median value, and for the percentage of missing data per SNP (&#x2264;10%). It is important to note that the additional quality filters cannot be applied in the UNEAK pipeline since Illumina quality scores are neither used nor kept through the pipeline. All command lines are available in Supplementary Data File <xref ref-type="supplementary-material" rid="SM3">S1</xref>, and datasets are available at <ext-link ext-link-type="uri" xlink:href="https://sites.google.com/site/africropproject/data">https://sites.google.com/site/africropproject/data</ext-link>.</p>
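<p>For illustration, the site-level quality filters can be written as a small Python predicate. This is a sketch under stated assumptions, not the command lines of Supplementary Data File S1: the dictionary keys mirror the usual VCF annotations (MQ0, DP, QUAL, QD), the function names are hypothetical, and we assume that sites with QUAL or QD at or below the quoted thresholds were the ones removed.</p>
<preformat><![CDATA[
```python
def hard_to_validate(mq0, dp):
    """HARD_TO_VALIDATE expression: MQ0 >= 4 && ((MQ0 / (1.0 * DP)) > 0.1).

    mq0: number of reads with mapping quality zero at the site.
    dp:  total read depth at the site.
    """
    return mq0 >= 4 and dp > 0 and (mq0 / (1.0 * dp)) > 0.1


def passes_site_filters(site, min_qual=60.0, min_qd=6.87):
    """Return True if a site survives the mapping and quality filters.

    Assumed reading of the thresholds: a site is dropped when it is flagged
    as hard to validate, or when QUAL <= min_qual or QD <= min_qd.
    """
    if hard_to_validate(site["MQ0"], site["DP"]):
        return False
    return site["QUAL"] > min_qual and site["QD"] > min_qd
```
]]></preformat>
<p>A site with 10 zero-quality reads out of 50, for example, has a ratio of 0.2 and would be flagged, whereas 4 out of 100 would not.</p>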
</sec>
<sec><title>Overlap between the Two SNP Datasets</title>
<p>We aligned the Hapmap file of TAG sequences on the transcriptome using BWA version 0.7.5 (<xref ref-type="bibr" rid="B41">Li and Durbin, 2009</xref>) with -n 3, allowing for a maximum number of three alignments to output. We only report TAGs that had a unique hit.</p>
<p>In order to identify SNPs shared by the two datasets, we identified TAGs among the 21,913 final UNEAK SNPs that aligned to the transcriptome and extracted the SNP position. We then compared the position and the alleles to identify homologous SNPs in the TM dataset.</p>
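<p>The comparison of positions and alleles can be sketched as follows in Python. The helper names and data layout are hypothetical. The sketch assumes ungapped TAG alignments, a 0-based SNP offset within the 64-bp TAG, and an alignment start given as the leftmost contig coordinate. For reverse-strand alignments the offset is mirrored and the alleles are complemented.</p>
<preformat><![CDATA[
```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def snp_on_contig(align_start, offset, tag_len=64, reverse=False, alleles=("A", "C")):
    """Convert a SNP offset inside a tag to a contig position (ungapped alignment assumed)."""
    if reverse:
        # The tag was aligned as its reverse complement: mirror the offset, complement alleles.
        position = align_start + (tag_len - 1 - offset)
        alleles = tuple(COMPLEMENT[a] for a in alleles)
    else:
        position = align_start + offset
    return position, alleles


def shared_snps(uneak_snps, tm_snps):
    """Both inputs map (contig, position) -> pair of alleles.

    A SNP is considered shared when the position matches and the allele pair is identical.
    """
    shared = []
    for key, alleles in uneak_snps.items():
        other = tm_snps.get(key)
        if other is not None and frozenset(alleles) == frozenset(other):
            shared.append(key)
    return sorted(shared)
```
]]></preformat>
<p>Requiring identical allele pairs, and not only identical positions, guards against counting a site as shared when the two pipelines called different variants at the same coordinate.</p>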
</sec>
</sec>
<sec><title>Diversity Statistics and Population Genetics Structure</title>
<p>We performed most analyses in the R environment (<xref ref-type="bibr" rid="B60">R Core Team, 2015</xref><sup><xref ref-type="fn" rid="fn03">3</xref></sup>). We performed a principal component analysis (PCA) using SMARTPCA (<xref ref-type="bibr" rid="B55">Patterson et al., 2006</xref>; <xref ref-type="bibr" rid="B57">Price et al., 2006</xref>) as implemented in the R package SNPRelate (<xref ref-type="bibr" rid="B71">Zheng et al., 2012</xref>). We used the R package Adegenet (<xref ref-type="bibr" rid="B39">Jombart and Ahmed, 2011</xref>) to estimate heterozygosity values, and the R package Pegas (<xref ref-type="bibr" rid="B54">Paradis, 2010</xref>) for F-statistics. We used the sNMF software to identify population structure (<xref ref-type="bibr" rid="B21">Frichot et al., 2014</xref>). This software gives similar results to those obtained with STRUCTURE (<xref ref-type="bibr" rid="B58">Pritchard et al., 2000</xref>) but it is much faster and can handle a very large number of SNPs. Finally, the folded site frequency spectrum (SFS) was calculated and used to estimate &#x0398;<sub>w</sub>, &#x0398;<sub>&#x03C0;</sub> and Tajima&#x2019;s <italic>D</italic> (<xref ref-type="bibr" rid="B67">Tajima, 1989</xref>). In addition, we estimated the SFS expected for a population at equilibrium in each dataset (<xref ref-type="bibr" rid="B23">Fu, 1995</xref>).</p>
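<p>The relationship between the folded SFS and the diversity estimators can be made explicit with a short Python sketch using the standard formulas for Watterson's estimator, nucleotide diversity and Tajima's <italic>D</italic>. It is not the R code used for the analyses. It assumes that every SNP is scored on the same number <italic>n</italic> of sampled chromosomes (for diploid samples without missing data, twice the number of individuals), and the function names are hypothetical.</p>
<preformat><![CDATA[
```python
import math


def tajima_d_from_folded_sfs(sfs, n):
    """Compute (theta_w, theta_pi, Tajima's D) from a folded SFS.

    sfs[i - 1] is the number of SNPs whose minor allele count is i (i = 1 .. n // 2);
    n is the number of sampled chromosomes.
    """
    s = sum(sfs)  # number of segregating sites
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    theta_w = s / a1
    # Each SNP with minor allele count i contributes 2 i (n - i) / (n (n - 1)) to pi.
    theta_pi = sum(c * 2.0 * i * (n - i) / (n * (n - 1))
                   for i, c in enumerate(sfs, start=1))
    # Variance constants from Tajima (1989).
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n ** 2 + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    d = (theta_pi - theta_w) / math.sqrt(e1 * s + e2 * s * (s - 1))
    return theta_w, theta_pi, d


def expected_folded_sfs(n, theta=1.0):
    """Expected folded SFS for a neutral population at equilibrium."""
    out = []
    for i in range(1, n // 2 + 1):
        e = theta * (1.0 / i + 1.0 / (n - i))
        if i == n - i:  # the middle class is counted once, not twice
            e /= 2.0
        out.append(e)
    return out
```
]]></preformat>
<p>Feeding the expected neutral spectrum back into the first function gives equal values for the two estimators and a <italic>D</italic> of zero. An excess of low-frequency SNPs pushes <italic>D</italic> below zero and a deficit pushes it above zero.</p>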
</sec>
</sec>
<sec><title>Results</title>
<sec><title>Mapping and SNP Discovery</title>
<p>Both pipelines produced a similarly high number of SNPs. With the UNEAK pipeline, we were able to identify 262,928 biallelic SNPs. After filtering for depth (DP &#x2264; 51, 50.5% filtered) and missing data (NA &#x2265; 0.1, 41.2% filtered), we obtained 21,913 good quality SNPs. With the TM approach, a total of 16,399,078 cleaned reads with a mean size of 92 bp mapped on 36,918 contigs. The mean coverage was 41.33 &#x00B1; 44.2 and the mean MQ was 24.5. We identified 238,897 biallelic SNPs with a median depth of 90&#x00D7;; after filtering, we obtained a total of 22,262 good quality SNPs. Specific filters (SNP clustering, mapping and quality filters) from the TM pipeline removed nearly 50% of SNPs, while subsequent filters for depth (DP &#x2264; 90) and missing data (NA &#x2265; 0.1) removed 25 and 13.5%, respectively (<bold>Figure <xref ref-type="fig" rid="F2">2</xref></bold>).</p>
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption><p><bold>Proportions of SNPs removed for each filter applied for both datasets UNEAK and TM</bold>.</p></caption>
<graphic xlink:href="fpls-07-00777-g002.tif"/>
</fig>
<p>The final sets of SNPs revealed that the quality of the two approaches was equivalent. The UNEAK final set of 21,913 SNPs had a mean DP per site and per sample of 7.24 &#x00B1; 3.63 sd and an average missing rate per sample of 0.04 &#x00B1; 0.04 sd. The TM final set of 22,262 SNPs had a mean DP per site and per sample of 8.68 &#x00B1; 12 sd and an average missing rate per sample of 0.03 &#x00B1; 0.03 sd. Within the TM final set, 56% of SNPs were found within a distance of 64 bp. The missing rates per sample between the UNEAK and TM dataset were highly correlated (<italic>r</italic> = 0.95). However, missing data were on average 70% more frequent with UNEAK, since the average UNEAK:TM ratio of missing rates was 1.7 &#x00B1; 1.8 sd.</p>
<p>In addition, we tested direct mapping of the 262,928 UNEAK 64-bp TAGs on the transcriptome. A total of 21,410 TAG loci (8%) mapped on 13,177 transcriptome contigs (26%). The mapping was relatively good since 94% of the mapped TAGs had a unique hit, among which 96% had a perfect 64 bp match. The mean MQ of these unique hits was 34 &#x00B1; 9 sd. Among the 21,913 good quality TAGs, we found 3,146 TAGs (14%) that had good alignment on 2,382 (5%) contigs. Among those, we retrieved 822 SNPs common to the two datasets. Nearly all UNEAK SNPs had a MAF &#x003E; 0.05 (<bold>Supplementary Figure <xref ref-type="supplementary-material" rid="SM1">S1</xref></bold> and <bold>Table <xref ref-type="supplementary-material" rid="SM2">S1</xref></bold>). The correlation coefficient between allele frequencies estimated by both pipelines for shared SNPs was very strong (<italic>r</italic> = 0.98).</p>
</sec>
<sec><title>Genetic Structure and Genetic Diversity</title>
<p>The two datasets showed very similar inference of genetic structure. With both datasets, we identified <italic>K</italic> = 3 clusters, grouping populations geographically into Western, Central, and Eastern clusters (<bold>Figure <xref ref-type="fig" rid="F3">3</xref></bold>). Correlations between admixture values from both approaches within each cluster were high with <italic>r</italic> &#x003E; 0.99. The results of a PCA were similar (<bold>Figure <xref ref-type="fig" rid="F3">3</xref></bold>). Both datasets showed the same three geographic clusters and the correlation between PCA coordinates was very high (<italic>r</italic> &#x003E; 0.99). Comparing UNEAK and TM PCA, only one sample (sample 5726B1) was in a different position in the two plots. This individual had 17 times more missing data with the UNEAK dataset than with the TM dataset despite missing rates &#x003C;0.05%. This very high ratio of missing data between datasets might explain its outlier status. More generally, the regression of PCA coordinates between the two pipelines showed that most individuals qualified as slight outliers had three times more missing data in the UNEAK pipeline than in the TM pipeline. However, overall, we observed very good individual quality and a very strong congruent inference of population structure irrespective of which pipeline was used.</p>
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption><p><bold>Population structure inferences for both SNPs datasets UNEAK (left) and TM (right) using sNMF software (A,B) and PCA (C,D).</bold> Western cluster is in red, Central cluster is in blue, and Eastern cluster in green.</p></caption>
<graphic xlink:href="fpls-07-00777-g003.tif"/>
</fig>
<p>In contrast, genetic diversity assessment was affected differently depending on the pipeline. Heterozygosity values were almost two times higher with the UNEAK dataset than with the TM dataset (<bold>Table <xref ref-type="table" rid="T1">1</xref></bold>). For F-statistics, <italic>F</italic><sub>IS</sub> was slightly but significantly higher with the TM dataset and <italic>F</italic><sub>ST</sub> was significantly (two times) lower. When we compared observed SFS and expected SFS for a population at equilibrium, the UNEAK dataset clearly did not retrieve the expected amount of low frequency SNPs (<bold>Figure <xref ref-type="fig" rid="F4">4</xref></bold>). On the other hand, TM SFS appeared to overestimate their number. As a result, &#x0398;<sub>&#x03C0;</sub> was 2.2 times higher with the UNEAK dataset and Tajima&#x2019;s <italic>D</italic>-values consequently differed considerably with a positive Tajima&#x2019;s <italic>D</italic>-value of 2.74 for UNEAK and negative value of &#x2013;0.65 for TM dataset.</p>
<fig id="F4" position="float">
<label>FIGURE 4</label>
<caption><p><bold>Folded site frequency spectrum for both SNP datasets: (A) TM (in red) and its expected neutral SFS in black; (B) UNEAK (in blue) and its expected neutral SFS in black</bold>.</p></caption>
<graphic xlink:href="fpls-07-00777-g004.tif"/>
</fig>
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption><p>Summary of diversity statistics for the two SNPs datasets.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<th valign="top" align="left"></th>
<th valign="top" align="center" colspan="3">UNEAK<hr/></th>
<th valign="top" align="center" colspan="3">TM<hr/></th>
<th valign="top" align="center"><italic>P</italic>-value</th>
</tr>
<tr>
<th valign="top" align="left"></th>
<th valign="top" align="center">Mean</th>
<th valign="top" align="center">Median</th>
<th valign="top" align="center">Standard deviation</th>
<th valign="top" align="center">Mean</th>
<th valign="top" align="center">Median</th>
<th valign="top" align="center">Standard deviation</th>
<th valign="top" align="center"></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">H<sub>Exp</sub></td>
<td valign="top" align="center">0.28</td>
<td valign="top" align="center">0.25</td>
<td valign="top" align="center">0.13</td>
<td valign="top" align="center">0.12</td>
<td valign="top" align="center">0.04</td>
<td valign="top" align="center">0.15</td>
<td valign="top" align="center">&#x003C;0.0001</td>
</tr>
<tr>
<td valign="top" align="left">H<sub>Obs</sub></td>
<td valign="top" align="center">0.16</td>
<td valign="top" align="center">0.13</td>
<td valign="top" align="center">0.13</td>
<td valign="top" align="center">0.09</td>
<td valign="top" align="center">0.03</td>
<td valign="top" align="center">0.14</td>
<td valign="top" align="center">&#x003C;0.0001</td>
</tr>
<tr>
<td valign="top" align="left"><italic>F</italic><sub>ST</sub></td>
<td valign="top" align="center">0.22</td>
<td valign="top" align="center">0.2</td>
<td valign="top" align="center">0.18</td>
<td valign="top" align="center">0.1</td>
<td valign="top" align="center">0.04</td>
<td valign="top" align="center">0.18</td>
<td valign="top" align="center">&#x003C;0.0001</td>
</tr>
<tr>
<td valign="top" align="left"><italic>F</italic><sub>IS</sub></td>
<td valign="top" align="center">0.34</td>
<td valign="top" align="center">0.37</td>
<td valign="top" align="center">0.28</td>
<td valign="top" align="center">0.39</td>
<td valign="top" align="center">0.44</td>
<td valign="top" align="center">0.4</td>
<td valign="top" align="center">&#x003C;0.0001</td></tr>
</tbody>
</table>
</table-wrap>
</sec>
</sec>
<sec><title>Discussion</title>
<p>In this study, we compared two bioinformatics pipelines and their impact on population genetics statistics. Investigating genomic diversity is still challenging for non-model organisms with large genomes. RRL sequencing approaches, such as RAD-seq and GBS, have been proposed to reduce genome complexity. The NGS data obtained can be handled by different pipelines including Stacks and TASSEL. Here, we preferentially used the TASSEL pipeline because it is the most commonly used pipeline for crop studies. We therefore first used the UNEAK approach implemented in TASSEL and then proposed and tested an alternative strategy in which NGS genomic reads were directly mapped on the pearl millet transcriptome. This strategy was guided by the observation that species transcriptomes are becoming progressively more accessible thanks to transcriptional studies and that it would be advantageous to use them (<xref ref-type="bibr" rid="B62">Russell et al., 2013</xref>; <xref ref-type="bibr" rid="B10">Combosch and Vollmer, 2015</xref>). This strategy avoids <italic>de novo</italic> DNA assembly and offers the advantages of using a reference, for example access to a longer sequence around SNP sites, and has a greater probability of finding selection targets (<xref ref-type="bibr" rid="B32">Hancock et al., 2011</xref>).</p>
<sec><title>GBS Reads Biased Toward Coding Regions</title>
<p>The quality of our two final datasets is as good as the datasets used in other population genetic studies with final coverage ranging from 5 to 10 and missing values rates below 0.3. Many RRL datasets may have low coverage in studies whose design aims for more individuals or loci to increase the accuracy of population genetic parameters (<xref ref-type="bibr" rid="B1">Alex Buerkle and Gompert, 2013</xref>).</p>
<p>Surprisingly, we found that non-negligible numbers of UNEAK TAGs mapped to the transcriptome. <xref ref-type="bibr" rid="B10">Combosch and Vollmer (2015)</xref> found that about 15% of RAD loci mapped to 10% of transcriptome contigs. Our results are similar, with 8% of TAG loci mapping to 26% of transcriptome contigs. We originally expected a very low mapping rate since we were only mapping to the expressed genome. One possible explanation is the choice of the restriction enzyme used. Our study, and many others, used the <italic>ApeKI</italic> enzyme with the GBS approach (<xref ref-type="bibr" rid="B17">Elshire et al., 2011</xref>; <xref ref-type="bibr" rid="B43">Lu et al., 2013</xref>). Its methylation sensitivity made it possible to eliminate repetitive methylated genomic regions from the experiment (<xref ref-type="bibr" rid="B65">Sonah et al., 2013</xref>). In eukaryotes, non-methylated sites are preferentially found in coding regions (<xref ref-type="bibr" rid="B56">Phillips, 2008</xref>). In <italic>Populus</italic> populations, 27% of restriction sites from the whole genome were recovered using <italic>ApeKI</italic> for GBS, of which 70% fell into annotated genes (<xref ref-type="bibr" rid="B63">Schilling et al., 2014</xref>). In sweet cherry (<italic>Prunus avium</italic> L.) 66% of SNPs were found in genic regions (<xref ref-type="bibr" rid="B29">Guajardo et al., 2015</xref>). In the present study, based on a pearl millet genome estimated at 1.8 Gb (Xin Liu, BGI, personal communication) and a reference transcriptome of 36.5 Mb, we interrogated only 2% of the genome. We found that 6&#x2013;7% of reads per sample mapped to the transcriptome reference and 8% of UNEAK TAG loci were also aligned, which is three to four times more than the expected 2%. 
These results are in line with reports of an <italic>ApeKI</italic> enzyme bias toward coding regions in previous studies (<xref ref-type="bibr" rid="B63">Schilling et al., 2014</xref>; <xref ref-type="bibr" rid="B29">Guajardo et al., 2015</xref>).</p>
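<p>The enrichment argument above can be checked with a short calculation. The sketch below is illustrative only: it uses the genome and transcriptome sizes and the mapping percentages quoted in the text, and the function names are hypothetical helpers introduced here, not part of either pipeline.</p>
<preformat>
```python
# Sizes and percentages are those quoted in the text; helper names are illustrative.
GENOME_SIZE_BP = 1.8e9          # pearl millet genome size estimate
TRANSCRIPTOME_SIZE_BP = 36.5e6  # reference transcriptome size


def expected_fraction(target_bp, genome_bp):
    """Fraction of the genome represented by the target, assuming uniform sampling."""
    return target_bp / genome_bp


def fold_enrichment(observed_fraction, expected):
    """How many times larger the observed mapping fraction is than expected."""
    return observed_fraction / expected


expected = expected_fraction(TRANSCRIPTOME_SIZE_BP, GENOME_SIZE_BP)
print(f"expected fraction of the genome: {expected:.3f}")

observed = [
    ("reads per sample, lower bound", 0.06),
    ("reads per sample, upper bound", 0.07),
    ("UNEAK TAG loci", 0.08),
]
for label, fraction in observed:
    print(f"{label}: {fold_enrichment(fraction, expected):.1f}-fold")
```
</preformat>
<p>Under the uniform-sampling assumption, the observed mapping fractions are roughly three to four times the expected value, consistent with the statement above.</p>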
</sec>
<sec><title>Effect of Pipelines on SNPs Identified</title>
<p>Pipeline specifics influence the number of SNPs discovered and their distribution properties. SNP calling differs markedly between pipelines because they deal somewhat differently with sequencing errors, base quality values, and SNP and genotype calling methods and, in our case, with TAG catalog construction vs. transcriptome mapping. We now review some of the differences between the two approaches and how such differences could impact our results.</p>
<p>Among all the parameters that can affect SNP discovery, coverage is one of the most important. For instance, error rates are expected to increase with low coverage (&#x003C;20&#x00D7;; <xref ref-type="bibr" rid="B2">Andrews and Luikart, 2014</xref>). To limit the impact of coverage in both our pipelines, we filtered SNPs with a depth above the median value for each dataset (51&#x00D7; for UNEAK, 90&#x00D7; for TM). Both final datasets had similar coverage and similar missing rates. Thus, in that sense, coverage should have little effect on differences between datasets in the number of SNPs discovered and in population genetics estimates.</p>
<p>Another possible bias comes from repetitive regions in the genome, such as paralogs, which are not always easy to identify with NGS data. Different filters can be used to reduce the effect of unidentified paralogs. Paralogous regions are expected to align to multiple locations in the genome (<xref ref-type="bibr" rid="B36">Hohenlohe et al., 2012</xref>) and SNPs within paralogous genes are expected to show more than two alleles (<xref ref-type="bibr" rid="B20">Freedman et al., 2014</xref>). We only considered biallelic loci in the two datasets since, in RRL approaches, the problem of paralogs can be effectively addressed by ploidy-based filtering (<xref ref-type="bibr" rid="B38">Ilut et al., 2014</xref>). With the TM approach, we were able to apply an additional filter on MQ to reduce paralogous regions. However, when mapping UNEAK TAGs to the transcriptome, we found that 94% of the TAGs that had a hit mapped to a unique position. This suggests that even if no mapping filter can be applied, the probability of calling paralogs with the UNEAK pipeline is relatively low, and ploidy-based filtering thus appears to be sufficient to avoid paralog bias.</p>
<p>Statistical treatment of NGS sequences for a given genotype assumes that non-redundant reads are drawn independently from a single gene. Several artifacts can bias the genotype likelihood because reads violate this assumption: a read could be a duplicate (non-independent), an alternative allele could be missing (non-random draw), or reads from two different but similar genes could map to a single reference (not a single gene). Neither pipeline deals very easily with statistical non-independence of reads. Pipelines developed for RRL approaches were not able to handle allelic dropouts and mistake heterozygous presence/absence for homozygous presence/absence (<xref ref-type="bibr" rid="B12">Davey et al., 2013</xref>). A very recent pipeline for handling dominant and codominant markers has been developed (<xref ref-type="bibr" rid="B22">Fu et al., 2013</xref>). Yet with both of our approaches, a dominant marker (i.e., a mutation at the restriction site leading to allelic dropout) would have led to a homozygote call. Duplicate reads occur when, during DNA library preparation, two reads derive from a single DNA fragment by PCR duplication. PCR duplicates are by definition reads starting at the exact same mapping position. The effects of PCR duplicates on population genetics estimates have already been discussed (<xref ref-type="bibr" rid="B3">Arnold et al., 2013</xref>; <xref ref-type="bibr" rid="B12">Davey et al., 2013</xref>; <xref ref-type="bibr" rid="B25">Gautier et al., 2013</xref>).
By construction, in RRL based on restriction enzymes, reads start at the same mapping position, which is the RE site; applying a PCR duplicate filter is therefore not possible unless paired-end sequencing and random shearing are used (<xref ref-type="bibr" rid="B12">Davey et al., 2013</xref>). Recently, however, a new protocol has been proposed that introduces &#x201C;adaptor tags&#x201D; allowing PCR duplicate discrimination (<xref ref-type="bibr" rid="B68">Tin et al., 2015</xref>). In conclusion, for both approaches we used, filtering for PCR duplicates was not possible and we therefore expected both UNEAK and TM datasets to underestimate heterozygosities. This is congruent with the strong correlation observed between the frequencies estimated by both approaches for the shared SNPs.</p>
<p>The number of SNPs allowed within a genomic window is important since regions with too many SNPs are not reliable and may (i) contain many sequencing errors or (ii) be associated with paralogs. Within the TM pipeline, we applied a SNP clustering filter allowing no more than three SNPs per 10 bp. Nevertheless, this still allowed quite a number of SNPs in a 100 bp read. For instance, in the TM dataset, 56% of SNPs were less than 64 bp away from another SNP. Since the UNEAK approach only allows 1 SNP per 64 bp, more than 50% of TM SNPs would be automatically discarded by the UNEAK pipeline.</p>
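<p>The two spacing constraints discussed above can be illustrated with a minimal sketch. It is not the implementation used by either pipeline: the function names, the exact window semantics and the example positions are assumptions made here for illustration.</p>
<preformat>
```python
def flag_clustered(positions, max_snps=3, window=10):
    """Return SNP positions lying in some `window`-bp span that holds
    more than `max_snps` SNPs (assumed reading of a clustering filter)."""
    pos = sorted(positions)
    flagged = set()
    start = 0
    for end in range(len(pos)):
        # Advance the left edge until the span fits inside one window.
        while pos[end] - pos[start] >= window:
            start += 1
        if end - start + 1 > max_snps:
            flagged.update(pos[start:end + 1])
    return flagged


def fraction_with_close_neighbour(positions, distance=64):
    """Fraction of SNPs whose nearest neighbouring SNP is closer than `distance` bp."""
    pos = sorted(positions)
    if not pos:
        return 0.0
    close = 0
    for i, p in enumerate(pos):
        gaps = []
        if i > 0:
            gaps.append(p - pos[i - 1])
        if len(pos) - 1 > i:
            gaps.append(pos[i + 1] - p)
        if gaps and distance > min(gaps):
            close += 1
    return close / len(pos)


# Hypothetical SNP positions on one contig (illustrative values only).
snps = [10, 12, 15, 18, 100, 140, 400]
print(sorted(flag_clustered(snps)))
print(fraction_with_close_neighbour(snps))
```
</preformat>
<p>In this toy example the SNPs at positions 100 and 140 pass the clustering filter, yet they lie within 64 bp of each other, so a rule of one SNP per 64 bp would not retain both.</p>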
<p>However, the two pipelines differ strongly in their rare variant calling rates. Even with base quality above 30 on the Illumina sequencing platform, i.e., at most one error every 10<sup>3</sup> bases, the amount of data generated still produced numerous errors. Calling rare variants (or not) will depend on the SNP and genotype calling algorithm implemented in the software and on how sequencing error rates are considered (<xref ref-type="bibr" rid="B31">Han et al., 2014</xref>). With some pipelines, the error rate estimate is considered to be constant across the genome, while other pipelines estimate an error rate for each base (<xref ref-type="bibr" rid="B34">Hohenlohe et al., 2011</xref>). Error rate estimates can also account for dependency between sequencing errors (or not; <xref ref-type="bibr" rid="B31">Han et al., 2014</xref>). GATK assumes that sequencing errors are independent and takes coverage and base quality into consideration. Thus, unless coverage is about 10&#x00D7; per site per sample, GATK with UnifiedGenotyper can underestimate rare variants (<xref ref-type="bibr" rid="B31">Han et al., 2014</xref>), whereas UNEAK handles sequencing errors differently. To deal with this issue, the UNEAK pipeline uses a minimum ETR of 3% to call variants. This ETR has a direct impact on true low frequency variants: along with erroneous SNPs, true SNPs are discarded. This way of handling the sequencing error rate might be the main reason why the UNEAK SFS underestimates low frequency SNPs compared to the distribution expected for a population at equilibrium. It would also explain why so few SNPs are shared, since only frequent SNPs can be found in both datasets, which was confirmed by the distribution of MAF observed for shared SNPs.</p>
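<p>The relation between base quality and error rate mentioned above follows the standard Phred scale, in which a quality score Q corresponds to an error probability of 10 to the power of minus Q over 10. The sketch below assumes independent errors; the number of sequenced bases and the depth are hypothetical values chosen only to show the order of magnitude.</p>
<preformat>
```python
def phred_to_error_probability(q):
    """Per-base error probability for a Phred quality score q."""
    return 10 ** (-q / 10)


def expected_errors(n_bases, q):
    """Expected number of erroneous base calls among n_bases at quality q."""
    return n_bases * phred_to_error_probability(q)


def prob_at_least_one_error(depth, q):
    """Probability that one site covered by `depth` reads carries at least
    one erroneous base call, assuming independent errors."""
    return 1 - (1 - phred_to_error_probability(q)) ** depth


N_BASES = 1e9  # hypothetical amount of sequence, not a figure from the study
DEPTH = 50     # hypothetical per-site depth

print(phred_to_error_probability(30))
print(expected_errors(N_BASES, 30))
print(prob_at_least_one_error(DEPTH, 30))
```
</preformat>
<p>Even at Q30, a large run is expected to contain a very large absolute number of erroneous base calls, which is why the way a pipeline models sequencing errors matters for rare variants.</p>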
<p>In summary, we identified two main reasons for the low number of shared SNPs: (i) the UNEAK constraint of no more than one SNP within 64 bp, and (ii) the failure of UNEAK to recover rare variants, which represent the majority of the polymorphism expected for a population at equilibrium. We ended up with relatively few shared SNPs. However, the allele frequency correlation between these SNPs was very high.</p>
</sec>
<sec><title>Effect of Pipelines on Diversity Estimates</title>
<p>How the specific characteristics of the SNPs we identified will affect population genetics estimates is another important question. There is a growing literature on how parameters such as the number of mismatches allowed when assembling reads into orthologous loci with RRL approaches influence the number of SNPs identified and population-level results. Most available studies focus on the effect of Stacks pipeline parameters (<xref ref-type="bibr" rid="B8">Catchen et al., 2013</xref>; <xref ref-type="bibr" rid="B48">Mastretta-Yanes et al., 2014</xref>). For instance, allowing too few mismatches would lead to the creation of more loci than actually exist and, conversely, allowing too many mismatches would lead to merging paralogs. Being too stringent can increase genotyping error rates (<xref ref-type="bibr" rid="B48">Mastretta-Yanes et al., 2014</xref>) and overestimate homozygosity (<xref ref-type="bibr" rid="B38">Ilut et al., 2014</xref>). This could also have an effect on the identification of population structure (<xref ref-type="bibr" rid="B33">Harvey et al., 2015</xref>; <xref ref-type="bibr" rid="B61">Rodr&#x00ED;guez-Ezpeleta et al., 2016</xref>). The UNEAK pipeline allows only a 1 bp mismatch, which is the maximum stringency level for an RRL pipeline. Yet, we saw no effect on population structure and we observed very high congruence in population structure between the two datasets and with two different methods: a Bayesian method and a PCA. These results are similar to those obtained by <xref ref-type="bibr" rid="B61">Rodr&#x00ED;guez-Ezpeleta et al. (2016)</xref>.</p>
<p>The main difference we observed between pipelines concerned the identification of low frequency variants. We found that the UNEAK pipeline was not able to recover rare variants, while the SFS pattern for frequent variants was similar between pipelines. Thus, methods that rely mainly on more frequent alleles, such as population structure approaches, led to similar results. On the other hand, several statistics using low frequency variants differed considerably depending on the dataset used. Tajima&#x2019;s <italic>D</italic> test (<xref ref-type="bibr" rid="B67">Tajima, 1989</xref>) is based on the SFS pattern, where an excess of rare variants is the sign of a population expansion or positive selection and, conversely, a deficit of rare variants is the sign of a population contraction or balancing selection. The two pipelines gave highly contrasting results, ranging from an overall negative value of &#x2013;0.65 to a positive value of 2.74. An unbiased SFS is crucial for population genetics. Methods used to investigate population history, including bottlenecks or expansion events, are based on the difference between allelic diversity and heterozygosity and therefore depend on the identification of rare variants. Moreover, SFSs are widely used to test for signatures of selection, using Tajima&#x2019;s <italic>D</italic> as well as other tests such as the CLR test (<xref ref-type="bibr" rid="B51">Nielsen et al., 2009</xref>). With such SFS-based tests, the choice of calling pipeline might significantly affect which genomic regions are found to be under selection.</p>
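<p>To make the dependence of Tajima&#x2019;s <italic>D</italic> on rare variants concrete, the following sketch computes <italic>D</italic> from an SFS using the standard constants of Tajima (1989). It is an illustration with made-up spectra, not the code used to obtain the values reported here; the function name and input layout are assumptions, and the sample size should be at least four chromosomes for the variance term to be positive.</p>
<preformat>
```python
import math


def tajimas_d(sfs):
    """Tajima's D from a site frequency spectrum.

    sfs[i - 1] is the number of segregating sites at which one allele is
    carried by i of the n sampled chromosomes, for i = 1 .. n - 1.
    A folded spectrum padded with zeros gives the same value.
    """
    n = len(sfs) + 1
    seg_sites = sum(sfs)
    if seg_sites == 0:
        return float("nan")
    # Mean number of pairwise differences (pi).
    pi = sum(count * 2 * i * (n - i) / (n * (n - 1))
             for i, count in enumerate(sfs, start=1))
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / math.sqrt(variance)


# Toy spectra for n = 10 chromosomes (illustrative counts only).
excess_rare = [20, 0, 0, 0, 0, 0, 0, 0, 0]   # singletons only
deficit_rare = [0, 0, 0, 0, 20, 0, 0, 0, 0]  # intermediate frequencies only
print(tajimas_d(excess_rare))
print(tajimas_d(deficit_rare))
```
</preformat>
<p>A spectrum dominated by singletons yields a negative <italic>D</italic>, whereas a spectrum lacking rare variants yields a positive <italic>D</italic>, so a pipeline that systematically misses rare variants shifts <italic>D</italic> upward.</p>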
<p>Differences in the number of rare variants detected will also influence F-statistics in addition to heterozygosities. <italic>F</italic><sub>ST</sub> depends on allele frequency: low <italic>F</italic><sub>ST</sub> is expected for low frequency variants. Consequently, integrating more rare variants adds low <italic>F</italic><sub>ST</sub> values and thus lowers the mean <italic>F</italic><sub>ST</sub> value. The first and simplest consequence is that diversity estimates obtained with different pipelines become difficult to compare, an important issue in comparative studies (<xref ref-type="bibr" rid="B38">Ilut et al., 2014</xref>; <xref ref-type="bibr" rid="B33">Harvey et al., 2015</xref>). Another approach that has become very popular in the NGS era is <italic>F</italic><sub>ST</sub> outlier detection for the discovery of genes under selection. A number of <italic>F</italic><sub>ST</sub> outlier tests have been developed and extensively used for the discovery of candidate genes (<xref ref-type="bibr" rid="B6">Beaumont and Nichols, 1996</xref>; <xref ref-type="bibr" rid="B70">Vitalis et al., 2001</xref>; <xref ref-type="bibr" rid="B5">Beaumont and Balding, 2004</xref>; <xref ref-type="bibr" rid="B19">Foll and Gaggiotti, 2008</xref>; <xref ref-type="bibr" rid="B7">Bonhomme et al., 2010</xref>; <xref ref-type="bibr" rid="B30">G&#x00FC;nther and Coop, 2013</xref>; <xref ref-type="bibr" rid="B15">Duforet-Frebourg et al., 2014</xref>) and are based on the expected distribution of <italic>F</italic><sub>ST</sub>. Underestimating rare variants will affect the overall distribution of <italic>F</italic><sub>ST</sub> and might therefore have an impact on these selection tests.</p>
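<p>The link between allele frequency and <italic>F</italic><sub>ST</sub> can be illustrated with a simple heterozygosity-based definition for two populations of equal size, (H<sub>T</sub> minus H<sub>S</sub>) divided by H<sub>T</sub>. This is a didactic sketch without sample-size correction; it is not necessarily the estimator used in this study, and the function name and frequencies are illustrative.</p>
<preformat>
```python
def fst_two_pops(p1, p2):
    """Heterozygosity-based FST for one biallelic locus in two populations
    of equal size, given the allele frequencies p1 and p2."""
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        return float("nan")  # locus is monomorphic overall
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total


# Most differentiated case possible for a given frequency: the allele is
# private to population 1 and absent from population 2.
for p in (0.02, 0.10, 0.50, 1.00):
    print(p, round(fst_two_pops(p, 0.0), 4))
```
</preformat>
<p>Under these assumptions, an allele private to one population at frequency p gives an <italic>F</italic><sub>ST</sub> of p/(2 minus p), so a variant at 2% cannot exceed roughly 0.01 even when it is completely absent from the other population.</p>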
<p>Some of the differences in diversity estimates observed between datasets are certainly due to the fact that the UNEAK pipeline interrogates coding and non-coding regions while the TM pipeline only interrogates coding regions. However, like other authors, we observed that using <italic>ApeKI</italic> biased SNP discovery toward coding regions (<xref ref-type="bibr" rid="B10">Combosch and Vollmer, 2015</xref>; <xref ref-type="bibr" rid="B29">Guajardo et al., 2015</xref>). All in all, we believe that the bias in the diversity estimates is mainly the result of the properties of the pipelines. Biased SFSs result from different parameters, including the error rate estimation formula and the stringency allowed for TAG merging. An increasing number of studies suggest that genotype calling might no longer be needed for NGS data (<xref ref-type="bibr" rid="B53">Nielsen et al., 2011</xref>; <xref ref-type="bibr" rid="B24">Fumagalli et al., 2013</xref>; <xref ref-type="bibr" rid="B31">Han et al., 2014</xref>). <xref ref-type="bibr" rid="B53">Nielsen et al. (2011)</xref> pointed out that, until now, no satisfactory genotype calling algorithm is available that would lead to an unbiased SFS. These authors proposed a direct approach, implemented in ANGSD software (<xref ref-type="bibr" rid="B40">Korneliussen et al., 2014</xref>), that does not call genotypes. This approach has been extended by a modified PCA (<xref ref-type="bibr" rid="B24">Fumagalli et al., 2013</xref>) and an admixture estimation approach based on genotype likelihoods (<xref ref-type="bibr" rid="B64">Skotte et al., 2013</xref>). However, this software only works with BAM files as input and requires a reference. Given this limitation, SNPs obtained from the UNEAK pipeline could not be used, whereas our TM pipeline could integrate such analysis.</p>
</sec>
</sec>
<sec><title>Conclusion</title>
<p>We have demonstrated the possibilities and discussed the advantages and disadvantages of two pipelines used for SNP discovery when no genome reference is available. We found that the UNEAK pipeline, with little and simple bioinformatics work, can efficiently identify a large number of SNPs as well as highlight genetic clustering. However, we observed notable underestimation of rare variants that could impact population genetics estimates and the detection of selection. Therefore, we encourage researchers to pay more attention to the SFS. The transcriptome mapping approach was less biased in that sense and, more importantly, such a strategy could be used in combination with approaches that avoid genotype calling to further reduce bias in the SFS. This alternative strategy has the further advantage of enabling access to sequences surrounding SNPs for further genomic exploration. Moreover, since few SNPs are shared, both datasets could be combined, thereby significantly increasing the number of SNPs used.</p>
</sec>
<sec><title>Author Contributions</title>
<p>CB-S, CM, and YV designed the project. CM, MC, and JP carried out the molecular laboratory work. CB-S and J-BF analyzed the data. CB-S and YV wrote the manuscript. All the authors discussed the results and commented on the manuscript.</p>
</sec>
<sec><title>Conflict of Interest Statement</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
</body>
<back>
<fn-group>
<fn fn-type="financial-disclosure">
<p><bold>Funding.</bold> This work was supported by grant N&#x00B0; ANR-12-PDOC-009-01 from the Agence Nationale de la Recherche to CB-S. YV is supported by grant N&#x00B0; ANR-13-BSV7-0017-01.</p></fn>
</fn-group>
<sec sec-type="supplementary material">
<title>Supplementary Material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="http://journal.frontiersin.org/article/10.3389/fpls.2016.00777">http://journal.frontiersin.org/article/10.3389/fpls.2016.00777</ext-link></p>
<supplementary-material xlink:href="Image_1.TIF" id="SM1" mimetype="image/tiff" xmlns:xlink="http://www.w3.org/1999/xlink">
<label>FIGURE S1</label>
<caption><p><bold>Distribution of minor allele frequencies (MAF) estimated within the TM pipeline for shared SNPs</bold>.</p></caption>
</supplementary-material>
<supplementary-material xlink:href="Image_1.TIF" id="S1" mimetype="image/tiff" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Table_1.XLS" id="SM2" mimetype="application/vnd.ms-excel" xmlns:xlink="http://www.w3.org/1999/xlink">
<label>TABLE S1</label>
<caption><p><bold>List of SNPs shared between both datasets</bold>.</p></caption>
</supplementary-material>
<supplementary-material xlink:href="Table_1.XLS" id="S2" mimetype="application/vnd.ms-excel" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Presentation_1.PDF" id="SM3" mimetype="application/pdf" xmlns:xlink="http://www.w3.org/1999/xlink"/>
</sec>
<ref-list>
<title>References</title>
<ref id="B1"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Alex Buerkle</surname> <given-names>C.</given-names></name> <name><surname>Gompert</surname> <given-names>Z.</given-names></name></person-group> (<year>2013</year>). <article-title>Population genomics based on low coverage sequencing: how low should we go?</article-title> <source><italic>Mol. Ecol.</italic></source> <volume>22</volume> <fpage>3028</fpage>&#x2013;<lpage>3035</lpage>. <pub-id pub-id-type="doi">10.1111/mec.12105</pub-id></citation></ref>
<ref id="B2"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Andrews</surname> <given-names>K. R.</given-names></name> <name><surname>Luikart</surname> <given-names>G.</given-names></name></person-group> (<year>2014</year>). <article-title>Recent novel approaches for population genomics data analysis.</article-title> <source><italic>Mol. Ecol.</italic></source> <volume>23</volume> <fpage>1661</fpage>&#x2013;<lpage>1667</lpage>. <pub-id pub-id-type="doi">10.1111/mec.12686</pub-id></citation></ref>
<ref id="B3"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Arnold</surname> <given-names>B.</given-names></name> <name><surname>Corbett-Detig</surname> <given-names>R. B.</given-names></name> <name><surname>Hartl</surname> <given-names>D.</given-names></name> <name><surname>Bomblies</surname> <given-names>K.</given-names></name></person-group> (<year>2013</year>). <article-title>RADseq underestimates diversity and introduces genealogical biases due to nonrandom haplotype sampling.</article-title> <source><italic>Mol. Ecol.</italic></source> <volume>22</volume> <fpage>3179</fpage>&#x2013;<lpage>3190</lpage>. <pub-id pub-id-type="doi">10.1111/mec.12276</pub-id></citation></ref>
<ref id="B4"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Baird</surname> <given-names>N. A.</given-names></name> <name><surname>Etter</surname> <given-names>P. D.</given-names></name> <name><surname>Atwood</surname> <given-names>T. S.</given-names></name> <name><surname>Currey</surname> <given-names>M. C.</given-names></name> <name><surname>Shiver</surname> <given-names>A. L.</given-names></name> <name><surname>Lewis</surname> <given-names>Z. A.</given-names></name><etal/></person-group> (<year>2008</year>). <article-title>Rapid SNP discovery and genetic mapping using sequenced RAD markers.</article-title> <source><italic>PLoS ONE</italic></source> <volume>3</volume>:<issue>e3376</issue>. <pub-id pub-id-type="doi">10.1371/journal.pone.0003376</pub-id></citation></ref>
<ref id="B5"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Beaumont</surname> <given-names>M. A.</given-names></name> <name><surname>Balding</surname> <given-names>D. J.</given-names></name></person-group> (<year>2004</year>). <article-title>Identifying adaptive genetic divergence among populations from genome scans.</article-title> <source><italic>Mol. Ecol.</italic></source> <volume>13</volume> <fpage>969</fpage>&#x2013;<lpage>980</lpage>. <pub-id pub-id-type="doi">10.1111/j.1365-294X.2004.02125.x</pub-id></citation></ref>
<ref id="B6"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Beaumont</surname> <given-names>M. A.</given-names></name> <name><surname>Nichols</surname> <given-names>R. A.</given-names></name></person-group> (<year>1996</year>). <article-title>Evaluating loci for use in the genetic analysis of population structure.</article-title> <source><italic>Proc. R. Soc. B Biol. Sci.</italic></source> <volume>263</volume> <fpage>1619</fpage>&#x2013;<lpage>1626</lpage>. <pub-id pub-id-type="doi">10.1098/rspb.1996.0237</pub-id></citation></ref>
<ref id="B7"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bonhomme</surname> <given-names>M.</given-names></name> <name><surname>Chevalet</surname> <given-names>C.</given-names></name> <name><surname>Servin</surname> <given-names>B.</given-names></name> <name><surname>Boitard</surname> <given-names>S.</given-names></name> <name><surname>Abdallah</surname> <given-names>J. M.</given-names></name> <name><surname>Blott</surname> <given-names>S.</given-names></name><etal/></person-group> (<year>2010</year>). <article-title>Detecting selection in population trees: the Lewontin and Krakauer test extended.</article-title> <source><italic>Genetics</italic></source> <volume>186</volume> <fpage>241</fpage>&#x2013;<lpage>262</lpage>. <pub-id pub-id-type="doi">10.1534/genetics.110.117275</pub-id></citation></ref>
<ref id="B8"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Catchen</surname> <given-names>J.</given-names></name> <name><surname>Hohenlohe</surname> <given-names>P. A.</given-names></name> <name><surname>Bassham</surname> <given-names>S.</given-names></name> <name><surname>Amores</surname> <given-names>A.</given-names></name> <name><surname>Cresko</surname> <given-names>W. A.</given-names></name></person-group> (<year>2013</year>). <article-title>Stacks: an analysis tool set for population genomics.</article-title> <source><italic>Mol. Ecol.</italic></source> <volume>22</volume> <fpage>3124</fpage>&#x2013;<lpage>3140</lpage>. <pub-id pub-id-type="doi">10.1111/mec.12354</pub-id></citation></ref>
<ref id="B9"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Catchen</surname> <given-names>J. M.</given-names></name> <name><surname>Amores</surname> <given-names>A.</given-names></name> <name><surname>Hohenlohe</surname> <given-names>P.</given-names></name> <name><surname>Cresko</surname> <given-names>W.</given-names></name> <name><surname>Postlethwait</surname> <given-names>J. H.</given-names></name></person-group> (<year>2011</year>). <article-title>Stacks: building and genotyping loci de novo from short-read sequences.</article-title> <source><italic>G3 (Bethesda)</italic></source> <volume>1</volume> <fpage>171</fpage>&#x2013;<lpage>182</lpage>. <pub-id pub-id-type="doi">10.1534/g3.111.000240</pub-id></citation></ref>
<ref id="B10"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Combosch</surname> <given-names>D. J.</given-names></name> <name><surname>Vollmer</surname> <given-names>S. V.</given-names></name></person-group> (<year>2015</year>). <article-title>Trans-Pacific RAD-Seq population genomics confirms introgressive hybridization in Eastern Pacific <italic>Pocillopora</italic> corals.</article-title> <source><italic>Mol. Phylogenet. Evol.</italic></source> <volume>88</volume> <fpage>154</fpage>&#x2013;<lpage>162</lpage>. <pub-id pub-id-type="doi">10.1016/j.ympev.2015.03.022</pub-id></citation></ref>
<ref id="B11"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cronn</surname> <given-names>R.</given-names></name> <name><surname>Knaus</surname> <given-names>B. J.</given-names></name> <name><surname>Liston</surname> <given-names>A.</given-names></name> <name><surname>Maughan</surname> <given-names>P. J.</given-names></name> <name><surname>Parks</surname> <given-names>M.</given-names></name> <name><surname>Syring</surname> <given-names>J. V.</given-names></name><etal/></person-group> (<year>2012</year>). <article-title>Targeted enrichment strategies for next-generation plant biology.</article-title> <source><italic>Am. J. Bot.</italic></source> <volume>99</volume> <fpage>291</fpage>&#x2013;<lpage>311</lpage>. <pub-id pub-id-type="doi">10.3732/ajb.1100356</pub-id></citation></ref>
<ref id="B12"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Davey</surname> <given-names>J. W.</given-names></name> <name><surname>Cezard</surname> <given-names>T.</given-names></name> <name><surname>Fuentes-Utrilla</surname> <given-names>P.</given-names></name> <name><surname>Eland</surname> <given-names>C.</given-names></name> <name><surname>Gharbi</surname> <given-names>K.</given-names></name> <name><surname>Blaxter</surname> <given-names>M. L.</given-names></name></person-group> (<year>2013</year>). <article-title>Special features of RAD Sequencing data: implications for genotyping.</article-title> <source><italic>Mol. Ecol.</italic></source> <volume>22</volume> <fpage>3151</fpage>&#x2013;<lpage>3164</lpage>. <pub-id pub-id-type="doi">10.1111/mec.12084</pub-id></citation></ref>
<ref id="B13"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Davey</surname> <given-names>J. W.</given-names></name> <name><surname>Hohenlohe</surname> <given-names>P. A.</given-names></name> <name><surname>Etter</surname> <given-names>P. D.</given-names></name> <name><surname>Boone</surname> <given-names>J. Q.</given-names></name> <name><surname>Catchen</surname> <given-names>J. M.</given-names></name> <name><surname>Blaxter</surname> <given-names>M. L.</given-names></name></person-group> (<year>2011</year>). <article-title>Genome-wide genetic marker discovery and genotyping using next-generation sequencing.</article-title> <source><italic>Nat. Rev. Genet.</italic></source> <volume>12</volume> <fpage>499</fpage>&#x2013;<lpage>510</lpage>. <pub-id pub-id-type="doi">10.1038/nrg3012</pub-id></citation></ref>
<ref id="B14"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>DePristo</surname> <given-names>M. A.</given-names></name> <name><surname>Banks</surname> <given-names>E.</given-names></name> <name><surname>Poplin</surname> <given-names>R.</given-names></name> <name><surname>Garimella</surname> <given-names>K. V.</given-names></name> <name><surname>Maguire</surname> <given-names>J. R.</given-names></name> <name><surname>Hartl</surname> <given-names>C.</given-names></name><etal/></person-group> (<year>2011</year>). <article-title>A framework for variation discovery and genotyping using next-generation DNA sequencing data.</article-title> <source><italic>Nat. Genet.</italic></source> <volume>43</volume> <fpage>491</fpage>&#x2013;<lpage>498</lpage>. <pub-id pub-id-type="doi">10.1038/ng.806</pub-id></citation></ref>
<ref id="B15"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Duforet-Frebourg</surname> <given-names>N.</given-names></name> <name><surname>Bazin</surname> <given-names>E.</given-names></name> <name><surname>Blum</surname> <given-names>M. G. B.</given-names></name></person-group> (<year>2014</year>). <article-title>Genome scans for detecting footprints of local adaptation using a Bayesian factor model.</article-title> <source><italic>Mol. Biol. Evol.</italic></source> <volume>31</volume> <fpage>2483</fpage>&#x2013;<lpage>2495</lpage>. <pub-id pub-id-type="doi">10.1093/molbev/msu182</pub-id></citation></ref>
<ref id="B16"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ellegren</surname> <given-names>H.</given-names></name></person-group> (<year>2014</year>). <article-title>Genome sequencing and population genomics in non-model organisms.</article-title> <source><italic>Trends Ecol. Evol.</italic></source> <volume>29</volume> <fpage>51</fpage>&#x2013;<lpage>63</lpage>. <pub-id pub-id-type="doi">10.1016/j.tree.2013.09.008</pub-id></citation></ref>
<ref id="B17"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Elshire</surname> <given-names>R. J.</given-names></name> <name><surname>Glaubitz</surname> <given-names>J. C.</given-names></name> <name><surname>Sun</surname> <given-names>Q.</given-names></name> <name><surname>Poland</surname> <given-names>J. A.</given-names></name> <name><surname>Kawamoto</surname> <given-names>K.</given-names></name> <name><surname>Buckler</surname> <given-names>E. S.</given-names></name><etal/></person-group> (<year>2011</year>). <article-title>A robust, simple genotyping-by-sequencing (GBS) approach for high diversity species.</article-title> <source><italic>PLoS ONE</italic></source> <volume>6</volume>:<issue>e19379</issue>. <pub-id pub-id-type="doi">10.1371/journal.pone.0019379</pub-id></citation></ref>
<ref id="B18"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Etter</surname> <given-names>P. D.</given-names></name> <name><surname>Bassham</surname> <given-names>S.</given-names></name> <name><surname>Hohenlohe</surname> <given-names>P. A.</given-names></name> <name><surname>Johnson</surname> <given-names>E. A.</given-names></name> <name><surname>Cresko</surname> <given-names>W. A.</given-names></name></person-group> (<year>2011</year>). &#x201C;<article-title>Molecular methods for evolutionary genetics</article-title>,&#x201D; in <source><italic>Methods in Molecular Biology</italic></source> <volume>Vol. 772</volume> <role>eds</role> <person-group person-group-type="editor"><name><surname>Orgogozo</surname> <given-names>V.</given-names></name> <name><surname>Rockman</surname> <given-names>M. V.</given-names></name></person-group> (<publisher-loc>Berlin:</publisher-loc> <publisher-name>Springer Science+Business Media</publisher-name>), <fpage>1</fpage>&#x2013;<lpage>19</lpage>. <pub-id pub-id-type="doi">10.1007/978-1-61779-228-1_1</pub-id></citation></ref>
<ref id="B19"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Foll</surname> <given-names>M.</given-names></name> <name><surname>Gaggiotti</surname> <given-names>O.</given-names></name></person-group> (<year>2008</year>). <article-title>A genome-scan method to identify selected loci appropriate for both dominant and codominant markers: a Bayesian perspective.</article-title> <source><italic>Genetics</italic></source> <volume>180</volume> <fpage>977</fpage>&#x2013;<lpage>993</lpage>. <pub-id pub-id-type="doi">10.1534/genetics.108.092221</pub-id></citation></ref>
<ref id="B20"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Freedman</surname> <given-names>A. H.</given-names></name> <name><surname>Gronau</surname> <given-names>I.</given-names></name> <name><surname>Schweizer</surname> <given-names>R. M.</given-names></name> <name><surname>Ortega-Del Vecchyo</surname> <given-names>D.</given-names></name> <name><surname>Han</surname> <given-names>E.</given-names></name> <name><surname>Silva</surname> <given-names>P. M.</given-names></name><etal/></person-group> (<year>2014</year>). <article-title>Genome sequencing highlights the dynamic early history of dogs.</article-title> <source><italic>PLoS Genet.</italic></source> <volume>10</volume>:<issue>e1004016</issue>. <pub-id pub-id-type="doi">10.1371/journal.pgen.1004016</pub-id></citation></ref>
<ref id="B21"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Frichot</surname> <given-names>E.</given-names></name> <name><surname>Mathieu</surname> <given-names>F.</given-names></name> <name><surname>Trouillon</surname> <given-names>T.</given-names></name> <name><surname>Bouchard</surname> <given-names>G.</given-names></name> <name><surname>Fran&#x00E7;ois</surname> <given-names>O.</given-names></name></person-group> (<year>2014</year>). <article-title>Fast and efficient estimation of individual ancestry coefficients.</article-title> <source><italic>Genetics</italic></source> <volume>196</volume> <fpage>973</fpage>&#x2013;<lpage>983</lpage>. <pub-id pub-id-type="doi">10.1534/genetics.113.160572</pub-id></citation></ref>
<ref id="B22"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fu</surname> <given-names>X.</given-names></name> <name><surname>Dou</surname> <given-names>J.</given-names></name> <name><surname>Mao</surname> <given-names>J.</given-names></name> <name><surname>Su</surname> <given-names>H.</given-names></name> <name><surname>Jiao</surname> <given-names>W.</given-names></name> <name><surname>Zhang</surname> <given-names>L.</given-names></name><etal/></person-group> (<year>2013</year>). <article-title>RADtyping: an integrated package for accurate de novo codominant and dominant RAD genotyping in mapping populations.</article-title> <source><italic>PLoS ONE</italic></source> <volume>8</volume>:<issue>e79960</issue>. <pub-id pub-id-type="doi">10.1371/journal.pone.0079960</pub-id></citation></ref>
<ref id="B23"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fu</surname> <given-names>Y.-X.</given-names></name></person-group> (<year>1995</year>). <article-title>Statistical properties of segregating sites.</article-title> <source><italic>Theor. Popul. Biol.</italic></source> <volume>48</volume> <fpage>172</fpage>&#x2013;<lpage>197</lpage>. <pub-id pub-id-type="doi">10.1006/tpbi.1995.1025</pub-id></citation></ref>
<ref id="B24"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fumagalli</surname> <given-names>M.</given-names></name> <name><surname>Vieira</surname> <given-names>F. G.</given-names></name> <name><surname>Korneliussen</surname> <given-names>T. S.</given-names></name> <name><surname>Linderoth</surname> <given-names>T.</given-names></name> <name><surname>Huerta-S&#x00E1;nchez</surname> <given-names>E.</given-names></name> <name><surname>Albrechtsen</surname> <given-names>A.</given-names></name><etal/></person-group> (<year>2013</year>). <article-title>Quantifying population genetic differentiation from next-generation sequencing data.</article-title> <source><italic>Genetics</italic></source> <volume>195</volume> <fpage>979</fpage>&#x2013;<lpage>992</lpage>. <pub-id pub-id-type="doi">10.1534/genetics.113.154740</pub-id></citation></ref>
<ref id="B25"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gautier</surname> <given-names>M.</given-names></name> <name><surname>Gharbi</surname> <given-names>K.</given-names></name> <name><surname>Cezard</surname> <given-names>T.</given-names></name> <name><surname>Foucaud</surname> <given-names>J.</given-names></name> <name><surname>Kerdelhu&#x00E9;</surname> <given-names>C.</given-names></name> <name><surname>Pudlo</surname> <given-names>P.</given-names></name><etal/></person-group> (<year>2013</year>). <article-title>The effect of RAD allele dropout on the estimation of genetic variation within and between populations.</article-title> <source><italic>Mol. Ecol.</italic></source> <volume>22</volume> <fpage>3165</fpage>&#x2013;<lpage>3178</lpage>. <pub-id pub-id-type="doi">10.1111/mec.12089</pub-id></citation></ref>
<ref id="B26"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Glaubitz</surname> <given-names>J. C.</given-names></name> <name><surname>Casstevens</surname> <given-names>T. M.</given-names></name> <name><surname>Lu</surname> <given-names>F.</given-names></name> <name><surname>Harriman</surname> <given-names>J.</given-names></name> <name><surname>Elshire</surname> <given-names>R. J.</given-names></name> <name><surname>Sun</surname> <given-names>Q.</given-names></name><etal/></person-group> (<year>2014</year>). <article-title>TASSEL-GBS: a high capacity genotyping by sequencing analysis pipeline.</article-title> <source><italic>PLoS ONE</italic></source> <volume>9</volume>:<issue>e90346</issue>. <pub-id pub-id-type="doi">10.1371/journal.pone.0090346</pub-id></citation></ref>
<ref id="B27"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>G&#x00F3;ngora-Castillo</surname> <given-names>E.</given-names></name> <name><surname>Buell</surname> <given-names>C. R.</given-names></name></person-group> (<year>2013</year>). <article-title>Bioinformatics challenges in de novo transcriptome assembly using short read sequences in the absence of a reference genome sequence.</article-title> <source><italic>Nat. Prod. Rep.</italic></source> <volume>30</volume> <fpage>490</fpage>&#x2013;<lpage>500</lpage>. <pub-id pub-id-type="doi">10.1039/c3np20099j</pub-id></citation></ref>
<ref id="B28"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Grabherr</surname> <given-names>M. G.</given-names></name> <name><surname>Haas</surname> <given-names>B. J.</given-names></name> <name><surname>Yassour</surname> <given-names>M.</given-names></name> <name><surname>Levin</surname> <given-names>J. Z.</given-names></name> <name><surname>Thompson</surname> <given-names>D. A.</given-names></name> <name><surname>Amit</surname> <given-names>I.</given-names></name><etal/></person-group> (<year>2011</year>). <article-title>Full-length transcriptome assembly from RNA-Seq data without a reference genome.</article-title> <source><italic>Nat. Biotechnol.</italic></source> <volume>29</volume> <fpage>644</fpage>&#x2013;<lpage>652</lpage>. <pub-id pub-id-type="doi">10.1038/nbt.1883</pub-id></citation></ref>
<ref id="B29"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Guajardo</surname> <given-names>V.</given-names></name> <name><surname>Sol&#x00ED;s</surname> <given-names>S.</given-names></name> <name><surname>Sagredo</surname> <given-names>B.</given-names></name> <name><surname>Gainza</surname> <given-names>F.</given-names></name> <name><surname>Mu&#x00F1;oz</surname> <given-names>C.</given-names></name> <name><surname>Gasic</surname> <given-names>K.</given-names></name><etal/></person-group> (<year>2015</year>). <article-title>Construction of high density sweet cherry (<italic>Prunus avium</italic> L.) linkage maps using microsatellite markers and SNPs detected by genotyping-by-sequencing (GBS).</article-title> <source><italic>PLoS ONE</italic></source> <volume>10</volume>:<issue>e0127750</issue>. <pub-id pub-id-type="doi">10.1371/journal.pone.0127750</pub-id></citation></ref>
<ref id="B30"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>G&#x00FC;nther</surname> <given-names>T.</given-names></name> <name><surname>Coop</surname> <given-names>G.</given-names></name></person-group> (<year>2013</year>). <article-title>Robust identification of local adaptation from allele frequencies.</article-title> <source><italic>Genetics</italic></source> <volume>195</volume> <fpage>205</fpage>&#x2013;<lpage>220</lpage>. <pub-id pub-id-type="doi">10.1534/genetics.113.152462</pub-id></citation></ref>
<ref id="B31"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Han</surname> <given-names>E.</given-names></name> <name><surname>Sinsheimer</surname> <given-names>J. S.</given-names></name> <name><surname>Novembre</surname> <given-names>J.</given-names></name></person-group> (<year>2014</year>). <article-title>Characterizing bias in population genetic inferences from low-coverage sequencing data.</article-title> <source><italic>Mol. Biol. Evol.</italic></source> <volume>31</volume> <fpage>723</fpage>&#x2013;<lpage>735</lpage>. <pub-id pub-id-type="doi">10.1093/molbev/mst229</pub-id></citation></ref>
<ref id="B32"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hancock</surname> <given-names>A. M.</given-names></name> <name><surname>Brachi</surname> <given-names>B.</given-names></name> <name><surname>Faure</surname> <given-names>N.</given-names></name> <name><surname>Horton</surname> <given-names>M. W.</given-names></name> <name><surname>Jarymowycz</surname> <given-names>L. B.</given-names></name> <name><surname>Sperone</surname> <given-names>F. G.</given-names></name><etal/></person-group> (<year>2011</year>). <article-title>Adaptation to climate across the <italic>Arabidopsis thaliana</italic> genome.</article-title> <source><italic>Science</italic></source> <volume>334</volume> <fpage>83</fpage>&#x2013;<lpage>86</lpage>. <pub-id pub-id-type="doi">10.1126/science.1209244</pub-id></citation></ref>
<ref id="B33"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Harvey</surname> <given-names>M. G.</given-names></name> <name><surname>Judy</surname> <given-names>C. D.</given-names></name> <name><surname>Seeholzer</surname> <given-names>G. F.</given-names></name> <name><surname>Maley</surname> <given-names>J. M.</given-names></name> <name><surname>Graves</surname> <given-names>G. R.</given-names></name> <name><surname>Brumfield</surname> <given-names>R. T.</given-names></name></person-group> (<year>2015</year>). <article-title>Similarity thresholds used in DNA sequence assembly from short reads can reduce the comparability of population histories across species.</article-title> <source><italic>PeerJ</italic></source> <volume>3</volume>:<issue>e895</issue>. <pub-id pub-id-type="doi">10.7717/peerj.895</pub-id></citation></ref>
<ref id="B34"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hohenlohe</surname> <given-names>P. A.</given-names></name> <name><surname>Amish</surname> <given-names>S. J.</given-names></name> <name><surname>Catchen</surname> <given-names>J. M.</given-names></name> <name><surname>Allendorf</surname> <given-names>F. W.</given-names></name> <name><surname>Luikart</surname> <given-names>G.</given-names></name></person-group> (<year>2011</year>). <article-title>Next-generation RAD sequencing identifies thousands of SNPs for assessing hybridization between rainbow and westslope cutthroat trout.</article-title> <source><italic>Mol. Ecol. Resour.</italic></source> <volume>11</volume> <fpage>117</fpage>&#x2013;<lpage>122</lpage>. <pub-id pub-id-type="doi">10.1111/j.1755-0998.2010.02967.x</pub-id></citation></ref>
<ref id="B35"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hohenlohe</surname> <given-names>P. A.</given-names></name> <name><surname>Bassham</surname> <given-names>S.</given-names></name> <name><surname>Etter</surname> <given-names>P. D.</given-names></name> <name><surname>Stiffler</surname> <given-names>N.</given-names></name> <name><surname>Johnson</surname> <given-names>E. A.</given-names></name> <name><surname>Cresko</surname> <given-names>W. A.</given-names></name></person-group> (<year>2010</year>). <article-title>Population genomics of parallel adaptation in threespine stickleback using sequenced RAD tags.</article-title> <source><italic>PLoS Genet.</italic></source> <volume>6</volume>:<issue>e1000862</issue>. <pub-id pub-id-type="doi">10.1371/journal.pgen.1000862</pub-id></citation></ref>
<ref id="B36"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hohenlohe</surname> <given-names>P. A.</given-names></name> <name><surname>Catchen</surname> <given-names>J.</given-names></name> <name><surname>Cresko</surname> <given-names>W. A.</given-names></name></person-group> (<year>2012</year>). &#x201C;<article-title>Population genomic analysis of model and nonmodel organisms using sequenced RAD tags</article-title>,&#x201D; in <source><italic>Data Production and Analysis in Population Genomics</italic></source>, <role>eds</role> <person-group person-group-type="editor"><name><surname>Pompanon</surname> <given-names>F.</given-names></name> <name><surname>Bonin</surname> <given-names>A.</given-names></name></person-group> (<publisher-loc>Berlin:</publisher-loc> <publisher-name>Springer</publisher-name>), <fpage>235</fpage>&#x2013;<lpage>260</lpage>.</citation></ref>
<ref id="B37"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hohenlohe</surname> <given-names>P. A.</given-names></name> <name><surname>Day</surname> <given-names>M. D.</given-names></name> <name><surname>Amish</surname> <given-names>S. J.</given-names></name> <name><surname>Miller</surname> <given-names>M. R.</given-names></name> <name><surname>Kamps-Hughes</surname> <given-names>N.</given-names></name> <name><surname>Boyer</surname> <given-names>M. C.</given-names></name></person-group> (<year>2013</year>). <article-title>Genomic patterns of introgression in rainbow and westslope cutthroat trout illuminated by overlapping paired-end RAD sequencing.</article-title> <source><italic>Mol. Ecol.</italic></source> <volume>22</volume> <fpage>3002</fpage>&#x2013;<lpage>3013</lpage>. <pub-id pub-id-type="doi">10.1111/mec.12239</pub-id></citation></ref>
<ref id="B38"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ilut</surname> <given-names>D. C.</given-names></name> <name><surname>Nydam</surname> <given-names>M. L.</given-names></name> <name><surname>Hare</surname> <given-names>M. P.</given-names></name></person-group> (<year>2014</year>). <article-title>Defining loci in restriction-based reduced representation genomic data from nonmodel species: sources of bias and diagnostics for optimal clustering.</article-title> <source><italic>Biomed. Res. Int.</italic></source> <volume>2014</volume>:<issue>675158</issue>. <pub-id pub-id-type="doi">10.1155/2014/675158</pub-id></citation></ref>
<ref id="B39"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jombart</surname> <given-names>T.</given-names></name> <name><surname>Ahmed</surname> <given-names>I.</given-names></name></person-group> (<year>2011</year>). <article-title>adegenet 1.3-1: new tools for the analysis of genome-wide SNP data.</article-title> <source><italic>Bioinformatics</italic></source> <volume>27</volume> <fpage>3070</fpage>&#x2013;<lpage>3071</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btr521</pub-id></citation></ref>
<ref id="B40"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Korneliussen</surname> <given-names>T. S.</given-names></name> <name><surname>Albrechtsen</surname> <given-names>A.</given-names></name> <name><surname>Nielsen</surname> <given-names>R.</given-names></name></person-group> (<year>2014</year>). <article-title>ANGSD: analysis of next generation sequencing data.</article-title> <source><italic>BMC Bioinform.</italic></source> <volume>15</volume>:<issue>356</issue>. <pub-id pub-id-type="doi">10.1186/s12859-014-0356-4</pub-id></citation></ref>
<ref id="B41"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>H.</given-names></name> <name><surname>Durbin</surname> <given-names>R.</given-names></name></person-group> (<year>2009</year>). <article-title>Fast and accurate short read alignment with Burrows&#x2013;Wheeler transform.</article-title> <source><italic>Bioinformatics</italic></source> <volume>25</volume> <fpage>1754</fpage>&#x2013;<lpage>1760</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btp324</pub-id></citation></ref>
<ref id="B42"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>H.</given-names></name> <name><surname>Handsaker</surname> <given-names>B.</given-names></name> <name><surname>Wysoker</surname> <given-names>A.</given-names></name> <name><surname>Fennell</surname> <given-names>T.</given-names></name> <name><surname>Ruan</surname> <given-names>J.</given-names></name> <name><surname>Homer</surname> <given-names>N.</given-names></name><etal/></person-group> (<year>2009</year>). <article-title>The sequence alignment/map format and SAMtools.</article-title> <source><italic>Bioinformatics</italic></source> <volume>25</volume> <fpage>2078</fpage>&#x2013;<lpage>2079</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btp352</pub-id></citation></ref>
<ref id="B43"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lu</surname> <given-names>F.</given-names></name> <name><surname>Lipka</surname> <given-names>A. E.</given-names></name> <name><surname>Glaubitz</surname> <given-names>J.</given-names></name> <name><surname>Elshire</surname> <given-names>R.</given-names></name> <name><surname>Cherney</surname> <given-names>J. H.</given-names></name> <name><surname>Casler</surname> <given-names>M. D.</given-names></name><etal/></person-group> (<year>2013</year>). <article-title>Switchgrass genomic diversity, ploidy, and evolution: novel insights from a network-based SNP discovery protocol.</article-title> <source><italic>PLoS Genet.</italic></source> <volume>9</volume>:<issue>e1003215</issue>. <pub-id pub-id-type="doi">10.1371/journal.pgen.1003215</pub-id></citation></ref>
<ref id="B44"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mardis</surname> <given-names>E. R.</given-names></name></person-group> (<year>2008</year>). <article-title>Next-generation DNA sequencing methods.</article-title> <source><italic>Annu. Rev. Genomics Hum. Genet.</italic></source> <volume>9</volume> <fpage>387</fpage>&#x2013;<lpage>402</lpage>. <pub-id pub-id-type="doi">10.1146/annurev.genom.9.081307.164359</pub-id></citation></ref>
<ref id="B45"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mariac</surname> <given-names>C.</given-names></name> <name><surname>Luong</surname> <given-names>V.</given-names></name> <name><surname>Kapran</surname> <given-names>I.</given-names></name> <name><surname>Mamadou</surname> <given-names>A.</given-names></name> <name><surname>Sagnard</surname> <given-names>F.</given-names></name> <name><surname>Deu</surname> <given-names>M.</given-names></name><etal/></person-group> (<year>2006</year>). <article-title>Diversity of wild and cultivated pearl millet accessions (<italic>Pennisetum glaucum</italic> [L.] R. Br.) in Niger assessed by microsatellite markers.</article-title> <source><italic>Theor. Appl. Genet.</italic></source> <volume>114</volume> <fpage>49</fpage>&#x2013;<lpage>58</lpage>. <pub-id pub-id-type="doi">10.1007/s00122-006-0409-9</pub-id></citation></ref>
<ref id="B46"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Martin</surname> <given-names>J. A.</given-names></name> <name><surname>Wang</surname> <given-names>Z.</given-names></name></person-group> (<year>2011</year>). <article-title>Next-generation transcriptome assembly.</article-title> <source><italic>Nat. Rev. Genet.</italic></source> <volume>12</volume> <fpage>671</fpage>&#x2013;<lpage>682</lpage>. <pub-id pub-id-type="doi">10.1038/nrg3068</pub-id></citation></ref>
<ref id="B47"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Martin</surname> <given-names>M.</given-names></name></person-group> (<year>2011</year>). <article-title>Cutadapt removes adapter sequences from high-throughput sequencing reads.</article-title> <source><italic>EMBnet. J.</italic></source> <volume>17</volume> <fpage>10</fpage>&#x2013;<lpage>12</lpage>. <pub-id pub-id-type="doi">10.14806/ej.17.1.200</pub-id></citation></ref>
<ref id="B48"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mastretta-Yanes</surname> <given-names>A.</given-names></name> <name><surname>Arrigo</surname> <given-names>N.</given-names></name> <name><surname>Alvarez</surname> <given-names>N.</given-names></name> <name><surname>Jorgensen</surname> <given-names>T. H.</given-names></name> <name><surname>Pi&#x00F1;ero</surname> <given-names>D.</given-names></name> <name><surname>Emerson</surname> <given-names>B. C.</given-names></name></person-group> (<year>2014</year>). <article-title>Restriction site-associated DNA sequencing, genotyping error estimation and de novo assembly optimization for population genetic inference.</article-title> <source><italic>Mol. Ecol. Resour.</italic></source> <volume>15</volume> <fpage>28</fpage>&#x2013;<lpage>41</lpage>. <pub-id pub-id-type="doi">10.1111/1755-0998.12291</pub-id></citation></ref>
<ref id="B49"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Moumouni</surname> <given-names>K. H.</given-names></name> <name><surname>Kountche</surname> <given-names>B. A.</given-names></name> <name><surname>Jean</surname> <given-names>M.</given-names></name> <name><surname>Hash</surname> <given-names>C. T.</given-names></name> <name><surname>Vigouroux</surname> <given-names>Y.</given-names></name> <name><surname>Haussmann</surname> <given-names>B. I. G.</given-names></name><etal/></person-group> (<year>2015</year>). <article-title>Construction of a genetic map for pearl millet, <italic>Pennisetum glaucum</italic> (L.) R. Br., using a genotyping-by-sequencing (GBS) approach.</article-title> <source><italic>Mol. Breed.</italic></source> <volume>35</volume>:<issue>5</issue>. <pub-id pub-id-type="doi">10.1007/s11032-015-0212-x</pub-id></citation></ref>
<ref id="B50"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Narum</surname> <given-names>S. R.</given-names></name> <name><surname>Buerkle</surname> <given-names>C. A.</given-names></name> <name><surname>Davey</surname> <given-names>J. W.</given-names></name> <name><surname>Miller</surname> <given-names>M. R.</given-names></name> <name><surname>Hohenlohe</surname> <given-names>P. A.</given-names></name></person-group> (<year>2013</year>). <article-title>Genotyping-by-sequencing in ecological and conservation genomics.</article-title> <source><italic>Mol. Ecol.</italic></source> <volume>22</volume> <fpage>2841</fpage>&#x2013;<lpage>2847</lpage>. <pub-id pub-id-type="doi">10.1111/mec.12350</pub-id></citation></ref>
<ref id="B51"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nielsen</surname> <given-names>E. E.</given-names></name> <name><surname>Hemmer-Hansen</surname> <given-names>J.</given-names></name> <name><surname>Poulsen</surname> <given-names>N. A.</given-names></name> <name><surname>Loeschcke</surname> <given-names>V.</given-names></name> <name><surname>Moen</surname> <given-names>T.</given-names></name> <name><surname>Johansen</surname> <given-names>T.</given-names></name><etal/></person-group> (<year>2009</year>). <article-title>Genomic signatures of local directional selection in a high gene flow marine organism; the Atlantic cod (<italic>Gadus morhua</italic>).</article-title> <source><italic>BMC Evol. Biol.</italic></source> <volume>9</volume>:<issue>276</issue>. <pub-id pub-id-type="doi">10.1186/1471-2148-9-276</pub-id></citation></ref>
<ref id="B52"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nielsen</surname> <given-names>R.</given-names></name> <name><surname>Korneliussen</surname> <given-names>T.</given-names></name> <name><surname>Albrechtsen</surname> <given-names>A.</given-names></name> <name><surname>Li</surname> <given-names>Y.</given-names></name> <name><surname>Wang</surname> <given-names>J.</given-names></name></person-group> (<year>2012</year>). <article-title>SNP calling, genotype calling, and sample allele frequency estimation from new-generation sequencing data.</article-title> <source><italic>PLoS ONE</italic></source> <volume>7</volume>:<issue>e37558</issue>. <pub-id pub-id-type="doi">10.1371/journal.pone.0037558</pub-id></citation></ref>
<ref id="B53"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nielsen</surname> <given-names>R.</given-names></name> <name><surname>Paul</surname> <given-names>J. S.</given-names></name> <name><surname>Albrechtsen</surname> <given-names>A.</given-names></name> <name><surname>Song</surname> <given-names>Y. S.</given-names></name></person-group> (<year>2011</year>). <article-title>Genotype and SNP calling from next-generation sequencing data.</article-title> <source><italic>Nat. Rev. Genet.</italic></source> <volume>12</volume> <fpage>443</fpage>&#x2013;<lpage>451</lpage>. <pub-id pub-id-type="doi">10.1038/nrg2986</pub-id></citation></ref>
<ref id="B54"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Paradis</surname> <given-names>E.</given-names></name></person-group> (<year>2010</year>). <article-title>pegas: an R package for population genetics with an integrated-modular approach.</article-title> <source><italic>Bioinformatics</italic></source> <volume>26</volume> <fpage>419</fpage>&#x2013;<lpage>420</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btp696</pub-id></citation></ref>
<ref id="B55"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Patterson</surname> <given-names>N.</given-names></name> <name><surname>Price</surname> <given-names>A. L.</given-names></name> <name><surname>Reich</surname> <given-names>D.</given-names></name></person-group> (<year>2006</year>). <article-title>Population structure and eigenanalysis.</article-title> <source><italic>PLoS Genet.</italic></source> <volume>2</volume>:<issue>e190</issue>. <pub-id pub-id-type="doi">10.1371/journal.pgen.0020190</pub-id></citation></ref>
<ref id="B56"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Phillips</surname> <given-names>T.</given-names></name></person-group> (<year>2008</year>). <article-title>The role of methylation in gene expression.</article-title> <source><italic>Nat. Educ.</italic></source> <volume>1</volume> <issue>116</issue>.</citation></ref>
<ref id="B57"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Price</surname> <given-names>A. L.</given-names></name> <name><surname>Patterson</surname> <given-names>N. J.</given-names></name> <name><surname>Plenge</surname> <given-names>R. M.</given-names></name> <name><surname>Weinblatt</surname> <given-names>M. E.</given-names></name> <name><surname>Shadick</surname> <given-names>N. A.</given-names></name> <name><surname>Reich</surname> <given-names>D.</given-names></name></person-group> (<year>2006</year>). <article-title>Principal components analysis corrects for stratification in genome-wide association studies.</article-title> <source><italic>Nat. Genet.</italic></source> <volume>38</volume> <fpage>904</fpage>&#x2013;<lpage>909</lpage>. <pub-id pub-id-type="doi">10.1038/ng1847</pub-id></citation></ref>
<ref id="B58"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pritchard</surname> <given-names>J. K.</given-names></name> <name><surname>Stephens</surname> <given-names>M.</given-names></name> <name><surname>Donnelly</surname> <given-names>P.</given-names></name></person-group> (<year>2000</year>). <article-title>Inference of population structure using multilocus genotype data.</article-title> <source><italic>Genetics</italic></source> <volume>155</volume> <fpage>945</fpage>&#x2013;<lpage>959</lpage>.</citation></ref>
<ref id="B59"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pujolar</surname> <given-names>J. M.</given-names></name> <name><surname>Jacobsen</surname> <given-names>M. W.</given-names></name> <name><surname>Als</surname> <given-names>T. D.</given-names></name> <name><surname>Frydenberg</surname> <given-names>J.</given-names></name> <name><surname>Munch</surname> <given-names>K.</given-names></name> <name><surname>J&#x00F3;nsson</surname> <given-names>B.</given-names></name><etal/></person-group> (<year>2014</year>). <article-title>Genome-wide single-generation signatures of local selection in the panmictic European eel.</article-title> <source><italic>Mol. Ecol.</italic></source> <volume>23</volume> <fpage>2514</fpage>&#x2013;<lpage>2528</lpage>. <pub-id pub-id-type="doi">10.1111/mec.12753</pub-id></citation></ref>
<ref id="B60"><citation citation-type="journal"><collab>R Core Team</collab> (<year>2015</year>). <source><italic>R: A Language and Environment for Statistical Computing</italic>.</source> Available at: <ext-link ext-link-type="uri" xlink:href="https://www.r-project.org">https://www.r-project.org</ext-link></citation></ref>
<ref id="B61"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rodr&#x00ED;guez-Ezpeleta</surname> <given-names>N.</given-names></name> <name><surname>Bradbury</surname> <given-names>I. R.</given-names></name> <name><surname>Mendibil</surname> <given-names>I.</given-names></name> <name><surname>&#x00C1;lvarez</surname> <given-names>P.</given-names></name> <name><surname>Cotano</surname> <given-names>U.</given-names></name> <name><surname>Irigoien</surname> <given-names>X.</given-names></name></person-group> (<year>2016</year>). <article-title>Population structure of Atlantic mackerel inferred from RAD-seq derived SNP markers: effects of sequence clustering parameters and hierarchical SNP selection.</article-title> <source><italic>Mol. Ecol. Resour.</italic></source> <pub-id pub-id-type="doi">10.1111/1755-0998.12518</pub-id> [<comment>Epub ahead of print</comment>].</citation></ref>
<ref id="B62"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Russell</surname> <given-names>J.</given-names></name> <name><surname>Hackett</surname> <given-names>C.</given-names></name> <name><surname>Hedley</surname> <given-names>P.</given-names></name> <name><surname>Liu</surname> <given-names>H.</given-names></name> <name><surname>Milne</surname> <given-names>L.</given-names></name> <name><surname>Bayer</surname> <given-names>M.</given-names></name><etal/></person-group> (<year>2013</year>). <article-title>The use of genotyping by sequencing in blackcurrant (<italic>Ribes nigrum</italic>): developing high-resolution linkage maps in species without reference genome sequences.</article-title> <source><italic>Mol. Breed.</italic></source> <volume>33</volume> <fpage>835</fpage>&#x2013;<lpage>849</lpage>. <pub-id pub-id-type="doi">10.1007/s11032-013-9996-8</pub-id></citation></ref>
<ref id="B63"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schilling</surname> <given-names>M. P.</given-names></name> <name><surname>Wolf</surname> <given-names>P. G.</given-names></name> <name><surname>Duffy</surname> <given-names>A. M.</given-names></name> <name><surname>Rai</surname> <given-names>H. S.</given-names></name> <name><surname>Rowe</surname> <given-names>C. A.</given-names></name> <name><surname>Richardson</surname> <given-names>B. A.</given-names></name><etal/></person-group> (<year>2014</year>). <article-title>Genotyping-by-sequencing for <italic>Populus</italic> population genomics: an assessment of genome sampling patterns and filtering approaches.</article-title> <source><italic>PLoS ONE</italic></source> <volume>9</volume>:<issue>e95292</issue>. <pub-id pub-id-type="doi">10.1371/journal.pone.0095292</pub-id></citation></ref>
<ref id="B64"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Skotte</surname> <given-names>L.</given-names></name> <name><surname>Korneliussen</surname> <given-names>T. S.</given-names></name> <name><surname>Albrechtsen</surname> <given-names>A.</given-names></name></person-group> (<year>2013</year>). <article-title>Estimating individual admixture proportions from next generation sequencing data.</article-title> <source><italic>Genetics</italic></source> <volume>195</volume> <fpage>693</fpage>&#x2013;<lpage>702</lpage>. <pub-id pub-id-type="doi">10.1534/genetics.113.154138</pub-id></citation></ref>
<ref id="B65"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sonah</surname> <given-names>H.</given-names></name> <name><surname>Bastien</surname> <given-names>M.</given-names></name> <name><surname>Iquira</surname> <given-names>E.</given-names></name> <name><surname>Tardivel</surname> <given-names>A.</given-names></name> <name><surname>L&#x00E9;gar&#x00E9;</surname> <given-names>G.</given-names></name> <name><surname>Boyle</surname> <given-names>B.</given-names></name><etal/></person-group> (<year>2013</year>). <article-title>An improved genotyping by sequencing (GBS) approach offering increased versatility and efficiency of SNP discovery and genotyping.</article-title> <source><italic>PLoS ONE</italic></source> <volume>8</volume>:<issue>e54603</issue>. <pub-id pub-id-type="doi">10.1371/journal.pone.0054603</pub-id></citation></ref>
<ref id="B66"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sonah</surname> <given-names>H.</given-names></name> <name><surname>O&#x2019;Donoughue</surname> <given-names>L.</given-names></name> <name><surname>Cober</surname> <given-names>E.</given-names></name> <name><surname>Rajcan</surname> <given-names>I.</given-names></name> <name><surname>Belzile</surname> <given-names>F.</given-names></name></person-group> (<year>2015</year>). <article-title>Identification of loci governing eight agronomic traits using a GBS-GWAS approach and validation by QTL mapping in soya bean.</article-title> <source><italic>Plant Biotechnol. J.</italic></source> <volume>13</volume> <fpage>211</fpage>&#x2013;<lpage>221</lpage>. <pub-id pub-id-type="doi">10.1111/pbi.12249</pub-id></citation></ref>
<ref id="B67"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tajima</surname> <given-names>F.</given-names></name></person-group> (<year>1989</year>). <article-title>Statistical method for testing the neutral mutation hypothesis by DNA polymorphism.</article-title> <source><italic>Genetics</italic></source> <volume>123</volume> <fpage>585</fpage>&#x2013;<lpage>595</lpage>.</citation></ref>
<ref id="B68"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tin</surname> <given-names>M. M. Y.</given-names></name> <name><surname>Rheindt</surname> <given-names>F. E.</given-names></name> <name><surname>Cros</surname> <given-names>E.</given-names></name> <name><surname>Mikheyev</surname> <given-names>A. S.</given-names></name></person-group> (<year>2015</year>). <article-title>Degenerate adaptor sequences for detecting PCR duplicates in reduced representation sequencing data improve genotype calling accuracy.</article-title> <source><italic>Mol. Ecol. Resour.</italic></source> <volume>15</volume> <fpage>329</fpage>&#x2013;<lpage>336</lpage>. <pub-id pub-id-type="doi">10.1111/1755-0998.12314</pub-id></citation></ref>
<ref id="B69"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Upadhyaya</surname> <given-names>H. D.</given-names></name> <name><surname>Bajaj</surname> <given-names>D.</given-names></name> <name><surname>Das</surname> <given-names>S.</given-names></name> <name><surname>Saxena</surname> <given-names>M. S.</given-names></name> <name><surname>Badoni</surname> <given-names>S.</given-names></name> <name><surname>Kumar</surname> <given-names>V.</given-names></name><etal/></person-group> (<year>2015</year>). <article-title>A genome-scale integrated approach aids in genetic dissection of complex flowering time trait in chickpea.</article-title> <source><italic>Plant Mol. Biol.</italic></source> <volume>89</volume> <fpage>403</fpage>&#x2013;<lpage>420</lpage>. <pub-id pub-id-type="doi">10.1007/s11103-015-0377-z</pub-id></citation></ref>
<ref id="B70"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Vitalis</surname> <given-names>R.</given-names></name> <name><surname>Dawson</surname> <given-names>K.</given-names></name> <name><surname>Boursot</surname> <given-names>P.</given-names></name></person-group> (<year>2001</year>). <article-title>Interpretation of variation across marker loci as evidence of selection.</article-title> <source><italic>Genetics</italic></source> <volume>158</volume> <fpage>1811</fpage>&#x2013;<lpage>1823</lpage>.</citation></ref>
<ref id="B71"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zheng</surname> <given-names>X.</given-names></name> <name><surname>Levine</surname> <given-names>D.</given-names></name> <name><surname>Shen</surname> <given-names>J.</given-names></name> <name><surname>Gogarten</surname> <given-names>S. M.</given-names></name> <name><surname>Laurie</surname> <given-names>C.</given-names></name> <name><surname>Weir</surname> <given-names>B. S.</given-names></name></person-group> (<year>2012</year>). <article-title>A high-performance computing toolset for relatedness and principal component analysis of SNP data.</article-title> <source><italic>Bioinformatics</italic></source> <volume>28</volume> <fpage>3326</fpage>&#x2013;<lpage>3328</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/bts606</pub-id></citation></ref>
</ref-list>
<fn-group>
<fn id="fn01"><label>1</label><p><ext-link ext-link-type="uri" xlink:href="http://www.biotech.cornell.edu/brc/genomic-diversity">http://www.biotech.cornell.edu/brc/genomic-diversity</ext-link></p></fn>
<fn id="fn02"><label>2</label><p><ext-link ext-link-type="uri" xlink:href="https://sites.google.com/site/africropproject/data">https://sites.google.com/site/africropproject/data</ext-link></p></fn>
<fn id="fn03"><label>3</label><p><ext-link ext-link-type="uri" xlink:href="http://www.R-project.org">http://www.R-project.org</ext-link></p></fn>
</fn-group>
</back>
</article>