AUTHOR=Díaz-Arce Natalia, Rodríguez-Ezpeleta Naiara
TITLE=Selecting RAD-Seq Data Analysis Parameters for Population Genetics: The More the Better?
JOURNAL=Frontiers in Genetics
VOLUME=Volume 10 - 2019
YEAR=2019
URL=https://www.frontiersin.org/journals/genetics/articles/10.3389/fgene.2019.00533
DOI=10.3389/fgene.2019.00533
ISSN=1664-8021
ABSTRACT=Restriction site-associated DNA sequencing (RAD-seq) has become a powerful and widely used tool in molecular ecology studies, as it allows the cost-effective recovery of thousands of polymorphic sites across individuals of non-model organisms. However, the success of the technique depends on correct data processing that reduces potential loci assembly biases and, thus, genotyping error rates. When no reference genome is available, RAD-seq data processing involves the assembly of hundreds of thousands of high-throughput sequencing reads into orthologous loci, for which various key parameter values need to be selected by the researcher. Previous studies exploring the effect of these parameter values found or assumed that a larger number of recovered polymorphic loci is associated with a better assembly. Here, using three RAD-seq datasets from different species, we explore the effect of read filtering, loci assembly and polymorphic site selection on the number of markers obtained and the genetic differentiation inferred. We find (i) that recovery of higher numbers of polymorphic loci is not necessarily associated with higher genetic differentiation, (ii) that the presence of PCR duplicates, the selected loci assembly parameters and the selected SNP filtering parameters affect the number of recovered polymorphic loci and the degree of genetic differentiation, and (iii) that this effect is different in each dataset, meaning that defining a systematic universal protocol for RAD-seq data analysis may lead to missing information relevant to population differentiation.