AUTHOR=Korkuć Paula, Arends Danny, Brockmann Gudrun A. TITLE=Finding the Optimal Imputation Strategy for Small Cattle Populations JOURNAL=Frontiers in Genetics VOLUME=Volume 10 - 2019 YEAR=2019 URL=https://www.frontiersin.org/journals/genetics/articles/10.3389/fgene.2019.00052 DOI=10.3389/fgene.2019.00052 ISSN=1664-8021 ABSTRACT=Imputation from lower-density SNP chip genotypes to whole-genome sequence level is an established approach to generate high-density genotypes for many individuals. Imputation accuracy depends on many factors, and for small cattle populations such as the endangered German Black Pied cattle (DSN), determining the optimal imputation strategy is especially challenging because only a small number of high-density genotypes is available. In this paper, the accuracy of imputation was explored with regard to 1) phasing of the target population and the reference panel for imputation, 2) comparison of a 1-step imputation approach, where 50k genotypes are directly imputed to sequence level, to a 2-step imputation approach that uses an intermediate step, imputing first to 700k and subsequently to sequence level, 3) the software tools Beagle and Minimac, and 4) the size and composition of the reference panel for imputation. Analyses were performed for 30 DSN and 30 Holstein Friesian cattle available from the 1000 Bull Genomes Project. Imputation accuracy was assessed using a leave-one-out cross-validation procedure. We observed that phasing of the target populations and the reference panels affects the imputation accuracy significantly. Minimac reached higher accuracy when imputing using small reference panels, while Beagle performed better with larger reference panels. In contrast to previous research, we found that when a low number of animals is available at the intermediate imputation step, the 1-step imputation approach yielded higher imputation accuracy than the 2-step imputation approach.
Overall, the size of the reference panel for imputation is the most important factor leading to higher imputation accuracy, although using a larger reference panel consisting of breeds different from the target population can significantly reduce the accuracy. Our findings provide specific recommendations on how to obtain better imputation results, especially for populations such as DSN in which only a limited number of high-density genotyped or sequenced animals is available.
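
The leave-one-out cross-validation mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the abstract does not state which accuracy metric was used, so genotype concordance at the masked positions is assumed here, and `majority_imputer` is a hypothetical toy stand-in for a real tool such as Beagle or Minimac. Genotypes are assumed to be coded as 0/1/2 allele counts, and all function and variable names are illustrative.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence

Genotypes = List[int]  # assumed coding: 0/1/2 alternative-allele counts per SNP


def majority_imputer(reference: Sequence[Genotypes],
                     target_chip: Sequence[int],
                     chip_idx: Sequence[int]) -> Genotypes:
    """Toy stand-in for an imputation tool (hypothetical, not Beagle/Minimac).

    Fills every SNP with the most common genotype in the reference panel,
    then restores the genotypes actually observed on the chip.
    """
    n_snps = len(reference[0])
    imputed = []
    for j in range(n_snps):
        counts = Counter(animal[j] for animal in reference)
        imputed.append(counts.most_common(1)[0][0])
    for pos, genotype in zip(chip_idx, target_chip):
        imputed[pos] = genotype
    return imputed


def leave_one_out_concordance(
    panel: Dict[str, Genotypes],
    chip_idx: Sequence[int],
    impute: Callable[..., Genotypes] = majority_imputer,
) -> Dict[str, float]:
    """For each animal: remove it from the panel, reduce it to the chip
    positions, impute it from the remaining animals, and score concordance
    with its true genotypes at the positions that were masked."""
    chip_set = set(chip_idx)
    scores = {}
    for name, truth in panel.items():
        reference = [g for other, g in panel.items() if other != name]
        target_chip = [truth[i] for i in chip_idx]
        imputed = impute(reference, target_chip, chip_idx)
        masked = [i for i in range(len(truth)) if i not in chip_set]
        hits = sum(imputed[i] == truth[i] for i in masked)
        scores[name] = hits / len(masked)
    return scores


# Tiny made-up panel: four animals, four SNPs, only SNP 0 is "on the chip".
toy_panel = {
    "a": [0, 1, 2, 0],
    "b": [0, 1, 2, 0],
    "c": [0, 1, 2, 0],
    "d": [0, 1, 0, 2],
}
scores = leave_one_out_concordance(toy_panel, chip_idx=[0])
```

In this toy panel, animal `d` differs from the other three at two masked SNPs and is therefore imputed poorly, which loosely mirrors the abstract's point that a reference panel unlike the target animal can reduce accuracy. In practice the `impute` argument would wrap a call to real imputation software rather than the majority rule.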