Whole-Genome Sequencing of Three Native Cattle Breeds Originating From the Northernmost Cattle Farming Regions

Northern Fennoscandia and the Sakha Republic in the Russian Federation represent the northernmost regions on Earth where cattle farming has been traditionally practiced. In this study, we performed whole-genome sequencing to genetically characterize three rare native breeds Eastern Finncattle, Western Finncattle and Yakutian cattle adapted to these northern Eurasian regions. We examined the demographic history, genetic diversity and unfolded loci under natural or artificial selection. On average, we achieved 13.01-fold genome coverage after mapping the sequencing reads on the bovine reference genome (UMD 3.1) and detected a total of 17.45 million single nucleotide polymorphisms (SNPs) and 1.95 million insertions-deletions (indels). We observed that the ancestral species (Bos primigenius) of Eurasian taurine cattle experienced two notable prehistorical declines in effective population size associated with dramatic climate changes. The modern Yakutian cattle exhibited a higher level of within-population variation in terms of number of SNPs and nucleotide diversity than the contemporary European taurine breeds. This result is in contrast to the results of marker-based cattle breed diversity studies, indicating assortment bias in previous analyses. Our results suggest that the effective population size of the ancestral Asiatic taurine cattle may have been higher than that of the European cattle. Alternatively, our findings could indicate the hybrid origins of the Yakutian cattle ancestries and possibly the lack of intensive artificial selection. We identified a number of genomic regions under selection that may have contributed to the adaptation to the northern and subarctic environments, including genes involved in disease resistance, sensory perception, cold adaptation and growth. By characterizing the native breeds, we were able to obtain new information on cattle genomes and on the value of the adapted breeds for the conservation of cattle genetic resources.

Northern Fennoscandia and the Sakha Republic in the Russian Federation represent the northernmost regions on Earth where cattle farming has been traditionally practiced. In this study, we performed whole-genome sequencing to genetically characterize three rare native breeds Eastern Finncattle, Western Finncattle and Yakutian cattle adapted to these northern Eurasian regions. We examined the demographic history, genetic diversity and unfolded loci under natural or artificial selection. On average, we achieved 13.01-fold genome coverage after mapping the sequencing reads on the bovine reference genome (UMD 3.1) and detected a total of 17.45 million single nucleotide polymorphisms (SNPs) and 1.95 million insertions-deletions (indels). We observed that the ancestral species (Bos primigenius) of Eurasian taurine cattle experienced two notable prehistorical declines in effective population size associated with dramatic climate changes. The modern Yakutian cattle exhibited a higher level of within-population variation in terms of number of SNPs and nucleotide diversity than the contemporary European taurine breeds. This result is in contrast to the results of markerbased cattle breed diversity studies, indicating assortment bias in previous analyses. Our results suggest that the effective population size of the ancestral Asiatic taurine cattle may have been higher than that of the European cattle. Alternatively, our findings could indicate the hybrid origins of the Yakutian cattle ancestries and possibly the lack of intensive artificial selection. We identified a number of genomic regions under selection that may have contributed to the adaptation to the northern and subarctic environments, including genes involved in disease resistance, sensory perception, cold adaptation and growth. By characterizing the native breeds, we were able to obtain new information on cattle genomes and on the value of the adapted breeds for the conservation of cattle genetic resources.

INTRODUCTION
During their 8,000-10,000 years of domestication, taurine cattle (Bos taurus) have adapted to a wide variety of biogeographic zones and sociocultural environments as a result of natural and human-derived selection (Felius, 1995). Fennoscandia along with northwestern Russia and the region of Sakha (Yakutia) in eastern Siberia, are the northernmost territories where cattle farming has had a relatively long tradition as the livelihood of local people (Kopoteva and Partanen, 2009;Bläuer and Kantanen, 2013;Cramp et al., 2014;Egorov et al., 2015). In prehistoric and historic times, animal husbandry faced several challenges in these northern climatic conditions, such as short summers and limited vegetation resources for feeding during the long winters, and this practice required well-adapted animals that were suited to the available environmental resources and socioeconomic and cultural conditions (Kantanen et al., 2009a;Bläuer and Kantanen, 2013;Egorov et al., 2015).
Cattle breeds such as Eastern Finncattle, Icelandic cattle, Swedish Mountain cattle, Yakutian cattle and other northern native cattle breeds are assumed to have their origins in the near-eastern domesticated taurine cattle that once spread to these northern regions (Kantanen et al., 2000(Kantanen et al., , 2009aLi et al., 2007). Herd books, pedigree registers and breeding associations were established in the late 19th and early 20th centuries. Early native breeds had a pivotal socioeconomic role in dairy and beef production in the northern Eurasian regions but have been almost exclusively replaced by commercial international cattle populations bred for high-input, high-output farming systems. Exceptions to this trend are Yakutian cattle in Siberia and Icelandic cattle, which continue to have high regional importance in food production (Kantanen et al., 2000(Kantanen et al., , 2009a. The conservation of the genetic resources of native, typically lowprofit breeds is often motivated by the fact that these breeds may possess valuable genetic variations for future animal breeding and to address the challenges that animal production will face during adaptation to future conditions, brought about by factors such as climate change (Odegård et al., 2009;Boettcher et al., 2010;Kantanen et al., 2015). In addition, breeds such as Yakutian cattle exhibit adaptation in demanding environments and may be extremely useful for enabling animal production in marginal regions (Kantanen et al., 2015).
Previous studies on the characterization of cattle genetic resources in northern Eurasian breeds have used various methods to study within-breed genetic diversity, population structure, demographic factors and interbreed relationships, e.g., autosomal and Y-chromosomal microsatellites, mitochondrial D-loop and whole-genome SNP-marker scans (Li et al., 2007;Kantanen et al., 2009b;Iso-Touru et al., 2016). These studies have indicated, for example, the genetic distinctiveness of the native northern European cattle breeds (e.g., the Finnish native breeds and Yakutian cattle) from modern commercial dairy breeds (such as the Finnish Ayrshire and Holstein breeds). In addition, a whole-genome SNP genotyping analysis detected genomic regions targeted by selection, which, for example, contain immune and environmental adaptation related genes (Iso-Touru et al., 2016;Yurchenko et al., 2018). Whole-genome sequencing (WGS)-based approaches provide additional possibilities for investigation of the genetic diversity of livestock breeds adapted to various biogeographic regions and production environments. Moreover, recent advancements in bioinformatics and statistical tools have enhanced our understanding of the demographic evolution of domestic animal species, the possible role of genomic structural variations in the adaptation of livestock breeds in the course of domestication and selection and the biological functions of these genomic variations (Gutenkunst et al., 2009;Li and Durbin, 2011;Alachiotis et al., 2012;Pavlidis et al., 2013;Wang G.-D. et al., 2014;Wang M. et al., 2014;Librado et al., 2015).
To expand our knowledge of genomic variations in northern Eurasian taurine cattle, we performed whole-genome sequencing of five animals from each of three northern native breeds, namely, Eastern Finncattle, Western Finncattle, and Yakutian cattle (Figure 1). We examined the genetic diversity and population structures of the breeds and identified chromosomal regions and genes under selection pressure. We also studied the demographic history of the northern Eurasian taurine cattle by using the whole-genome sequence data.

Ethics Statement
Blood samples of animals for DNA extraction were collected by using a protocol approved by the Animal Experiment Board of MTT Agrifood Research Finland (currently the Natural Resources Institute Finland, Luke) and the Board of Agricultural Office of Eveno-Bytantaj Region, Sakkyryr, Sakha, Russia.

DNA Sample Preparation and Sequencing
DNA extracted from blood samples was available for the two Finnish cattle breeds (Eastern Finncattle and Western Finncattle) and one Siberian breed (Yakutian cattle) from a previous study (Li et al., 2007). Five unrelated individuals from each breed (14 females and one Yakutian cattle bull) were examined. Genomic DNA was extracted using a standard phenol/chloroform-based protocol (Malke, 1990). For sequencing library preparation following the manufacturer's specifications, the genomic DNA of each individual was fragmented randomly. After electrophoresis, DNA fragments of desired length were gel purified. One type of library was constructed for each sample (500 bp insert size); 15 paired-end DNA libraries were constructed for the 15 samples. Adapter ligation and DNA cluster preparation were performed, and the DNA was subjected to Illumina HiSeq 2000 sequencing using the 2 × 100 bp mode at Beijing Genomics Institute (BGI, Shenzhen, China). Finally, paired-end sequence data were generated. To ensure quality, the raw data was modified by the following two steps using SOAPnuke (Chen et al., 2018a,b): first, the contaminating adapter sequences from the reads were deleted, and then, the reads that contained more than 50% low-quality bases (quality value ≤5) were removed.

Short Read Alignment and Mapping
For short read alignment, the bovine reference genome (UMD 3.1), including regions that were not assembled into chromosomes (Zimin et al., 2009), were downloaded from the Ensembl database release 71 (Flicek et al., 2013) and indexed using SAMtools v0.1.19 (Li et al., 2009). Paired-end 100-bp short reads from each individual sample were mapped against the bovine reference genome assembly UMD 3.1 using BWA v0.7.5a with the default parameters. After mapping, for downstream SNP and insertion-deletion (indel) detection, the SAM files that were generated from BWA were converted to the corresponding binary equivalent BAM files and sorted simultaneously using SortSam.jar in Picard tools v1.102 1 . We used Picard tools to remove PCR duplicates from the aligned reads and then used the uniquely mapped reads for variant calling.

SNP and Indel Annotation and Gene Ontology Analysis
ANNOVAR (Wang et al., 2010) was used to annotate the functions of the variants (exonic, intronic, 5 and 3 UTRs, splicing, intergenic) using Ensembl release 71. SNPs that were identified in the exonic regions were classified as synonymous or nsSNPs. We performed GO analysis for genes containing nsSNPs and indels using the GO Analysis Toolkit and Database for Agricultural Community (AgriGO) (Du et al., 2010). Following the approaches by Kawahara-Miki et al. (2011), Liao et al. (2013), and , we selected genes containing >5 nsSNPs for each breed. The significantly enriched GO terms were assessed by Fisher's exact test with the Bonferroni correction using default parameters (P-value, 0.05; at least 5 mapping entries). Out of four indel classes (frameshift, non-frameshift, stopgain, and stoploss), we annotated frameshift indels in exonic regions using default parameters in ANNOVAR. Frameshift indels may change amino acid sequences and thereby affect protein function.

Identification and Annotation of Selective Sweeps
We investigated for all identified SNPs the signatures of selection using SFS-based α statistics in SweeD (Pavlidis et al., 2013) with default parameters, except setting the grid as the only parameter. SweeD detects the signature of selection based on the CLR test using SFS-based statistics. SweeD was run separately for each chromosome by setting the grid parameter at 5kb equidistant positions across the chromosome (size of the chromosome/5 kb). We used BEAGLE program ver.4 (Browning and Browning, 2007) to impute missing alleles and infer the haplotype phase for all individual Western Finncattle, Yakutian cattle, and Eastern Finncattle simultaneously (among the Eastern Finncattle, we excluded one inbred animal; see section "Results"). The BEAGLE program infers the haplotype information of each chromosome, which is required for α statistics. Following the approaches described in previous studies (Wang M. et al., 2014;McManus et al., 2015), we selected the outliers falling within the top 0.5% of the CLR distribution. The cutoff value for α statistics was taken as the 99.5 percentile of the empirical distribution of the 5-kb equidistant positions across the genome for each chromosome. Annotation of the candidate sites that exhibited a signal of selection was performed using Ensembl BioMart (Kinsella et al., 2011) by considering a 150-kb sliding window on the outlier sites. Candidate genes exhibiting signatures of selection were subjected to GO analysis with same parameters applied in the variant annotation using AgriGO.

Population Genetics Analysis
The average pairwise nucleotide diversity within a population (π) and the proportion of polymorphic sites (Watterson's θ) were computed using the Bio::PopGen::Statistics package in BioPerl (v1.6.924) (Stajich et al., 2002). PCA was conducted using smartpca in EIGENSOFT3.0 software (Patterson et al., 2006) on biallelic autosomal SNPs that were genotyped in all individuals. Significant eigenvectors were determined using Tracy-Widom statistics with the twstats program implemented in the same EIGENSOFT package.

Demographic History Inference
We used the PSMC model (Li and Durbin, 2011) to construct the demographic history of the three breeds. For the analysis, one individual per breed with the highest sequence depth was selected to explore changes in local density of heterozygous sites across the cattle genome. The following default PSMC parameters were set: −N25, −t15, −r5 and −p '4+25 * 2+4+6'. To scale the PSMC output to real time, we assumed a neutral mutation rate of 1.1 × 10 −8 per generation and an average generation time of 5 years (Kumar and Subramanian, 2002;Murray et al., 2010;MacLeod et al., 2013). As the power of the PSMC approach to reconstruct recent demographic history is not reliable (Li and Durbin, 2011;MacLeod et al., 2013;Zhao et al., 2013), we reconstructed a more recent demographic history of the Finnish and Yakutian populations using the ∂a∂i program (dadi-1.6.3) (Gutenkunst et al., 2009). We used the intergenic sites from the identified SNPs in the 15 individuals to compute the folded SFS. We merged the results for the Eastern and Western Finncattle breeds, as these breeds exhibited similar genetic diversity measures (Supplementary Figure S4). Since we had 10 Finncattle and 5 Yakutian samples, we downscaled the Finncattle sample size to be equal to that of the Yakutian cattle. We ran the ∂a∂i algorithm multiple times to ensure convergence and selected the optimal parameters with the highest likelihood as the final result. As ∂a∂i requires NA, we calculated NA using the formula NA = θ/4µL, where θ was the observed number of segregating sites divided by the sum of the expected SFS using the best-fit parameters of our model, L was the effective sequence length, and µ was the mutation rate per generation per site. We used a mutation rate of 1.1 × 10 −8 mutations per generation assuming that one generation was equal to 5 years (Kumar and Subramanian, 2002), and the effective sequence length (intergenic regions) was 10,836,904. We calculated population size and divergence time between the Finnish and Yakutian populations based on NA. Finally, using the parameters described previously, we generated the demographic model using ∂a∂i as shown in Supplementary Figure S5. The optimal model identified the change from the NA to the effective population size (nua) from the time Ta to the time Td. Ta was the time period when the change in NA started and Td was the time when the divergence between the Finnish and Yakutian cattle occurred. nu1F and nu2Y were the effective population sizes during the split. To calculate the statistical confidence in the estimated parameter values, we estimated the parameter uncertainties using the Hessian method (a.k.a. the Fisher information matrix).

Sequence Data
A total of 521 Gb of paired-end DNA sequence data was obtained after removing adapter sequences and low-quality reads ( Table 1 and Supplementary Table S1). On average, each sample had 347.4 million (M) reads, 98.45% of which were successfully mapped to the bovine reference genome UMD3.1 (Table 1 and  Supplementary Table S1), representing 13.01-fold coverage.

Identification and Annotation of Variants
A total of 17.45 M SNPs were detected in the mapped reads across all 15 samples, with Yakutian cattle exhibiting the highest number of SNPs (Table 2, Figure 2A and Supplementary Table S2). The average number of SNPs detected per individual within the breeds was 5.73, 6.03, and 7.12 M in Eastern Finncattle, Western Finncattle and Yakutian cattle, respectively (Supplementary Table S2). A total of 6.3 M (36.1%) SNPs were shared by the three breeds, and as expected, the Finnish breeds shared the highest number (n = 8.06 M, 46.2%) of SNPs (Figure 2A). Moreover, we   (Figure 2A). The transition-to-transversion (TS/TV) ratios were 2.20 and 2.23 in the Finncattle and Yakutian cattle, respectively (Supplementary Table S2). The observed Ts/Tv ratios were consistent with those observed in previous studies in mammalian systems (Lachance et al., 2012;Choi et al., 2013Choi et al., , 2014, indicating the quality of our SNP data. Of the SNPs identified in our analysis, 1.07 M (6.13%) SNPs were found to be novel when compared to NCBI dbSNP bovine build 150. At the breed level, 2.8, 2.6, and 4.5% of the total SNPs in the Eastern Finncattle, Western Finncattle and Yakutian cattle, respectively, were novel. Furthermore, out of the novel SNPs identified for each breed, 258,409 (83.3%), 219,234 (81.5%), and 528,763 (95%) were breed-specific SNPs in Eastern Finncattle, Western Finncattle, and Yakutian cattle (Figure 2B), respectively. A summary of the homozygous and heterozygous SNPs is given in Supplementary Tables S2, S3. One Eastern Finncattle cow (sample_3 in Supplementary Table S3) exhibited exceptionally low diversity, with only 1.66 M (32.58%) heterozygous and 3.44 M (67.42%) homozygous SNPs. This animal originated from an isolated, inbred herd and represented one relict Eastern Finncattle line (herd) that passed through the breed's demographic bottleneck (Kantanen et al., 2000). After excluding this sample, the average number of SNPs detected per Eastern Finncattle individual was 5.88 M, and the Eastern Finncattle animals exhibited 2.63 M (44.83%) homozygous and 3.24 M (55.17%) heterozygous SNPs, with a ratio of 1:1.23 (homozygous:heterozygous). Apparently, the number of homozygous SNPs in the Eastern Finncattle was higher than that in the other two breeds.
In total, we detected 2.12 M indels, 79.72% of which were found in the dbSNP build 150, with 20.28% being novel ( Figure 2C and Supplementary Table S2). At the breed level, 12.9, 11.6, and 16% of the total indels in the Eastern Finncattle, Western Finncattle, and Yakutian cattle, respectively, were novel.
In our data, on average, 0.65% of the SNPs were detected in exonic regions, 25.1% in intronic regions, 72.6% in intergenic regions, and 1.65% in UTRs and in regions upstream and downstream of genes (  Table 2) and were found in 10,309, 9,864, and 10,429 genes, respectively.
The functional categories of the indel mutations are presented in Table 2. In total, 1,045, 927, and 1,148 of the indels were frameshift indels that were associated with 808, 770, and 895 genes in Eastern Finncattle, Western Finncattle, and Yakutian cattle, respectively (Supplementary Datas S1-S3).
We computed the genotype concordance between the SNPs detected by our SNP calling pipeline and the previous SNP genotype study (Iso-Touru et al., 2016); Illumina BovineSNP50 BeadChip v.1.0 (Matukumalli et al., 2009). On average, we found 85.34% of SNPs detected by sequencing were concordant with the SNP50 BeadChips genotypes suggesting that the SNPs detected in our study with the current coverage (on average 13.01-fold coverage/sample) yielded sufficient genotypic accuracy.

GO Analysis of the SNPs and Indels
The GO enrichment analysis of 1,331, 1,170, and 1,442 genes containing >5 nsSNPs (Supplementary Datas S4-S6  A detailed comparison of the biological processes associated with genes with >5 nsSNPs with the bovine Ensembl gene set (n = 25,160) is shown in Supplementary Figure S1. The GO enrichment analysis revealed that a majority of the significantly enriched GO terms were shared by the three cattle breeds. "Response to stimulus, GO:00050896" was associated with approximately 50% of the genes in Eastern Finncattle (n = 611), Western Finncattle (n = 544), and Yakutian cattle (n = 629) (see Supplementary Figure S1). In addition, this analysis showed that in each breed, a large number of genes were associated with immune functions, such as "Immune response, GO:0006955, " "Defense response, GO:0006952, " "Antigen processing and presentation, GO:0019882, " and "Immune system process, GO:0002376." Among the three breeds, the Yakutian cattle had more enriched genes associated with immune functions than the two Finncattle breeds. On the other hand, in the Finncattle breeds, a large number of genes were associated with sensory perception functions, such as "Sensory perception, GO:0007600, " "Sensory perception of smell, GO:0007608, " and "Detection of chemical stimulus involved in sensory perception, GO:0050907." In Yakutian cattle, none of the GO terms associated with sensory perception were enriched. However, 55 genes associated with "Developmental growth, GO: 0048589" were enriched in only Yakutian cattle.
We further identified the top genes, namely, TTN, PKHD1, GPR98, and ASPM, that had at least 40 nsSNPs in all the breeds. These genes have large sizes; TTN is 274 kb in size, PKHD1 is 455 kb, GPR98 is 188 kb and ASPM is 64 kb. Among the genes with nsSNPs, TTN contained the highest number of nsSNPs: 68, 63, and 87 nsSNPs in Eastern Finncattle, Western Finncattle, and Yakutian cattle, respectively. The TTN gene is present on chromosome 2 and is associated with meat quality (Sasaki et al., 2006;Watanabe et al., 2011).
A total of 709, 675, and 772 genes associated with frameshift indels in these breeds were linked to at least one GO term (Supplementary Figure S2 and Supplementary Datas S10-S12). The results indicated that a majority of the significantly enriched GO terms were shared by the breeds. The GO terms "Defense response, GO:0006952" and "Female pregnancy, GO:0007565" were enriched exclusively in Yakutian cattle. In total, 96 genes were enriched in "Defense response, GO:0006952."

Selection Signatures
We identified 2,528 sites exhibiting signatures of selection in each breed, of which 58, 61, and 53% mapped to gene regions in Eastern Finncattle, Western Finncattle, and Yakutian cattle, respectively (Supplementary Figure S3). Information regarding the SNPs found in selective sweep regions in each breed is shown in Supplementary Table S5.
A total of 28, 67, and 13 GO terms were significantly enriched in Eastern Finncattle, Western Finncattle, and Yakutian cattle, respectively (Supplementary Datas S19-S21). We found only one significantly enriched GO term ("GMP binding, GO:0019002") that was shared by the three cattle breeds. The GO terms "Homophilic cell adhesion, GO:0007156, " "Calciumdependent cell-cell adhesion, GO:0016339, " and "Multicellular organism reproduction, GO:0032504" were shared by the Finncattle breeds. Most of the significantly enriched GO terms (23, 62, and 12 in Eastern Finncattle, Western Finncattle, and Yakutian cattle, respectively) were 'breed-specific' in our data. In addition, we examined the significantly enriched GO terms that were potentially involved in cold adaptation by assuming that in extremely cold environments, energy requirement is high and fat and lipids are the main sources of energy . The levels of fatty acids, lipids and phospholipids typically increase with decreasing temperatures (Puraae et al., 2011). The significantly enriched GO terms associated with Western Finncattle included "Lipid localization, GO:0010876, " "Lipid digestion, GO:0044241, " "Unsaturated fatty acid biosynthetic process, GO:0006636, " and "Unsaturated fatty acid metabolic process, GO:0033559." However, no significantly enriched GO terms associated with fatty acid and lipid metabolism and biosynthesis were identified in Eastern Finncattle and Yakutian cattle.

Population Genetics Analysis
The overall genome-wide genetic diversity, as measured by Watterson's θ and pairwise nucleotide diversity (π), were higher in the Yakutian cattle (0.001588 and 1.728 × 10 −3 , respectively) than in Eastern Finncattle (0.001445 and 1.559 × 10 −3 , respectively) and Western Finncattle (0.001398 and 1.512 × 10 −3 , respectively), and these results were inconsistent with those of previous studies based on autosomal microsatellite and SNP data sets, which showed that Finncattle were more diverse than the Yakutian cattle (e.g., Li and Kantanen, 2010).
We also applied PCA to examine the genetic relationships among the three cattle breeds. In the PCA plot, the Finncattle and Yakutian cattle were grouped in the first eigenvectors, indicating clear genetic differentiation (Supplementary Figure S4). The inbred Eastern Finncattle animal grouped separately from the other Finncattle animals.

Demographic Population Size History
The PSMC profiles of the contemporary Finnish and Siberian native cattle were used to construct the demographic prehistory and evolution of ancestral populations of northern Eurasian cattle. As shown in Figure 3, the temporal PSMC profiles of the FIGURE 3 | Demographic history of the northernmost cattle breeds reconstructed from three cattle genomes, one from each breed, by using PSMC. The X-axis shows the time in thousand years (Kyr), and the Y -axis shows the effective population size. three cattle genomes followed a similar pattern. The ancestral species of northern Eurasian taurine cattle, the near-eastern aurochs (Bos primigenius) (Kantanen et al., 2009a), experienced two population peaks starting at ∼1 Mya and ∼40 kya and two bottlenecks at ∼250 and ∼12 kya (Figure 3). After the first population expansion, the population size declined gradually. The second population expansion of the ancestral wild species began around ∼80 kya and started to decline around ∼30 kya, leading to a second bottleneck.
We also used the ∂a∂i program to reconstruct the recent northern European cattle demographic history (from 418 kya to the present). The parameters Ta, Td, nua, nu1F, and nu2Y in the demographic model are shown and explained in Supplementary Figure S5 and Supplementary Table S7. Based on this model, we estimated that the reference NA was 43,116. The optimal model fit for each parameter and CI are shown in Supplementary Table  S7 by fixing NA at 43,116 and generation time at 5 years. Our best-fit model indicated that the ancestral population underwent a size change to 51,883 (95% CI, 51,658-52,108) at 418 kya (95% CI, 413.96-409.47 kya) (Supplementary Table S7). This result is consistent with the PSMC profile (Figure 3). In addition, our model suggested that the divergence of North European native cattle and East Siberian Turano-Mongolian type of cattle occurred 8,822 years ago (95% CI, 8,775-8,869 years ago).

DISCUSSION
To our knowledge, this is the first whole-genome sequencebased report on the genetic diversity of Eurasian native cattle (B. taurus) breeds that have adapted to the northernmost cattle farming regions, even subarctic regions. The contemporary genetic resources of the Eastern Finncattle, Western Finncattle and Yakutian cattle breeds studied are the result of a complex process of genetic and demographic events that occurred during the domestication and selection and even the evolution of the ancestral species of northern Eurasian taurine cattle, namely, the near-eastern aurochs (B. primigenius).

Demographic Evolution of Bos primigenius
As shown in Figure 3, the auroch species (B. primigenius) experienced two notable prehistorical population expansions, after which the population size declined gradually. The first marked decline in the effective population size (Ne) occurred during the Middle Pleistocene period starting after ∼1 Mya, which may have been associated with reduction in global temperatures and even with negative actions of humans on the auroch population (Barnosky et al., 2004;Hughes et al., 2007). The second marked decline in Ne prior to domestication was obviously caused by dramatic climate changes during the last glacial maximum (Yokoyama et al., 2000). Although the sequencing depth attained in this study was not ideal for PSMC analysis (typically >20-fold coverage), our observations regarding the temporal changes in the Ne of the aurochs during the Pleistocene period (Mei et al., 2018) followed the pattern observed for ancestral populations of several other domestic mammalian species, such as pig [Sus scrofa; (Groenen et al., 2012)], horse [Equus caballus; (Librado et al., 2016)], and sheep [Ovis aries; (Yang et al., 2016)]. The ∂a∂i results confirmed the past fluctuations in the prehistorical Ne of B. primigenius (Supplementary Table S7), and the comparison of the current SNP-based estimated Ne of the present cattle breeds [∼100; (Iso-Touru et al., 2016)] to the Ne of the corresponding early domesticated ancestral populations showed that there was a dramatic decline in the Ne during domestication and breed formation. In addition, our demographic analysis (Supplementary Figure S5) provided new knowledge of the prehistory of northern Eurasian native cattle. As suggested by a previous study (Kantanen et al., 2009b), both the Finnish and Yakutian native cattle descended from the near-eastern aurochs domesticated 8,000-10,000 years ago. Here, our results have shown that the two northern Eurasian native cattle lineages may have already diverged in the early stage of taurine cattle domestication, more than 8,000 years ago.

High Genetic Variability in the Yakutian Cattle
The total number of sequence variants identified on average in Eastern Finncattle and Western Finncattle animals (e.g., 5.88 and 6.03 M SNPs, respectively, exhibiting a minor allele frequency > 0.05) corresponded well to numbers found typically in European taurine animals. In contrast, we found that the Yakutian cattle exhibited a higher number of SNPs on average per individual (7.12 M SNPs) than the number of SNPs detected in European and Asiatic humpless cattle to date (Tsuda et al., 2013;Choi et al., 2014;Szyda et al., 2015). According to (Szyda et al., 2015) and studies cited therein, a European taurine animal may exhibit on average 2.06-6.12, 5.89-6.37, 5.85-6.40, and 5.93 M SNPs, while (Choi et al., 2014)

detected 5.81 M SNPs in a Korean
Holstein cattle individual, a breed that originated from western Europe and North America. Typically, it may be possible to detect additional SNPs by increasing the sequencing depth (Szyda et al., 2015). In addition to the average number of SNPs per individual, total number of SNPs and number of indels, the Yakutian cattle exhibited the highest number of exonic SNPs and nsSNPs among the three northern native breeds studied. However, although the Yakutian cattle had the highest number of nsSNPs and genes with >5 nsSNPs, the functional annotation of the exonic SNPs by GO analysis indicated that the lowest number of significantly enriched GO terms was obtained for the Yakutian cattle.
Our estimates for the population-level diversity for the Eastern Finncattle, Western Finncattle, and Yakutian cattle [the nucleotide diversity (π) values were 1.559 × 10 −3 , 1.512 × 10 −3 , and 1.728 × 10 −3 , respectively, and the proportions of polymorphic sites (θ) were 0.001445, 0.001398, and 0.001588, respectively] exceed those typically found in European taurine cattle breeds (Kim et al., 2017;Chen et al., 2018a,b;Mei et al., 2018). We observed that Yakutian cattle such as the Asiatic taurine cattle breeds exhibit high levels of genomic diversity in terms of π and θ estimates. The typical nucleotide diversity values for the European taurine cattle are >1.0 × 10 −3 , while those for the Asiatic taurine breeds are closer to ∼2.0 × 10 −3 than to 1.0 × 10 −3 (Kim et al., 2017;Chen et al., 2018a,b;Mei et al., 2018). We observed higher within-population diversity for the Yakutian cattle than that observed for several other taurine cattle breeds, which differs from previous estimates based on autosomal microsatellites and whole-genome SNP data (Li et al., 2007;Iso-Touru et al., 2016), where lower levels of variation were observed in Yakutian cattle, indicating that the genetic variation in Yakutian cattle has been underestimated. The set of autosomal microsatellites recommended by FAO (the Food and Agricultural Organization of the United Nations) for biodiversity analysis of cattle breeds and the design of commercial SNP BeadChips used in cattle whole-genome genotyping were derived mainly from the genetic data of western breeds, causing a bias in the diversity estimates of clearly genetically distinct cattle breeds, such as Yakutian cattle (Li et al., 2007;Iso-Touru et al., 2016).
There could have been differences in the past effective population sizes of the European and Asiatic taurine cattle, and the present elevated genomic diversity of the Asiatic taurine cattle breeds may reflect the higher "ancient" effective sizes of the ancestral populations of the Asiatic taurine breeds (Chen et al., 2018a,b). However, the prehistory of domesticated cattle in East Asia appears to be more complex than previously thought (Zhang et al., 2013;Gao et al., 2017;Chen et al., 2018a,b), and an additional speculative explanation for the elevated genomic diversity in the Yakutian cattle and several other Asiatic taurine cattle breeds (or their ancestral populations) could be ancient introgression with the East Asian aurochs (B. primigenius) that lived in the East Asian region during the arrival of near-eastern taurine cattle (Chen et al., 2018a,b). The previous mtDNA and Y-chromosomal diversity study indicated the near-eastern origins of the ancestral population of the Yakutian cattle (Kantanen et al., 2009b). The possible hybrid origins of the Yakutian cattle ancestries may have increased the genetic variation in the ancestral population of Yakutian cattle seen even in the current population and may have played a pivotal role in the process of adaptation of the Yakutian cattle to the subarctic environment in the Sakha Republic, eastern Siberia.
The high number of SNPs and high genomic diversity found in the Yakutian cattle may be due partly to the breed's selection history: the artificial selection by humans has not been intensive (Kantanen et al., 2009b). The Yakutian cattle breed is an aboriginal taurine population, the gene pool of which has been shaped by natural and artificial selection. However, the centuries-old "folk selection" methods and traditional knowledge for the selection of the most suitable animals for the challenging subarctic environment followed the methods used by local people rather than the breeding implemented by organizations or institutions (Kantanen et al., 2009a). When compared with the Western Finncattle and Eastern Finncattle in the present study, the Yakutian cattle exhibited distinctly low numbers of candidate genes that exhibited selection signatures (n = 371, n = 331, and n = 249, respectively). Among these three breeds, Western Finncattle have been subjected to the most intensive artificial selection for milk production characteristics, while the production selection program of Eastern Finncattle was stopped in the 1960s, when the census population size of this native breed declined rapidly. Currently, in vivo and in vitro conservation activities are being implemented for Eastern Finncattle (and for Western Finncattle and Yakutian cattle). In addition, although Yakutian cattle had the highest number of genes containing SNPs (also nsSNPs) among the three breeds, the GO analysis indicated that this breed had the lowest number of significantly enriched GO terms (Eastern Finncattle,111;Western Finncattle,113;and Yakutian cattle,95). This difference between the native Finnish cattle and Yakutian cattle can be due to the differences in the selection histories of these breeds.

Genomic Characteristics of the Northern Eurasian Taurine Cattle Breeds
The GO enrichment analysis of genes harboring >5 nsSNPs indicated that genes related, e.g., to immunity and "response to stimulus" are overrepresented in the set of genes identified in the northern Eurasian native cattle breeds in this study. "Response to stimulus" refers to a change in the state or activity of a cell or an organism as a result of the detection of a stimulus, e.g., a change in enzyme production or gene expression (Gene Ontology Browser). This observation was consistent with previous cattle sequencing analyses (Choi et al., 2014;Stafuzza et al., 2017;Mei et al., 2018) and suggests that these genes were under positive selection during the course of cattle evolution and provided survival benefits, e.g., during environmental changes (Nielsen et al., 2007). Interestingly, genes related to the GO term "Sensory perception" were enriched in Eastern Finncattle and Western Finncattle but not Yakutian cattle. We performed a manual search for genes associated with "Sensory perception." We found that 47 of these genes exhibited >5 nsSNPs in Eastern Finncattle and Western Finncattle, most of which were olfactory receptor genes. We determined the number of SNPs and nucleotide diversity of this set of genes and found that the Yakutian cattle exhibited less variation than the two Finnish native breeds (the number of SNPs and π-estimates for Eastern Finncattle, Western Finncattle and Yakutian cattle were 2,298 and 1.864 × 10 −3 ; 2,091 and 1.792 × 10 −3 ; 1,478 and 1.113 × 10 −3 , respectively), which is in contrast to the number of SNPs and π-estimates obtained for the entire genomes of the breeds. Great variations in the number of olfactory receptor genes and structural variations in these genes among mammalian species and even individuals within species (e.g., in humans) have been interpreted as reflecting the effects of environmental factors on the genetic diversity of this multigene family and demonstrate the importance of these genes from the evolutionary point of view (Niimura, 2011;Niimura et al., 2014). Therefore, we hypothesize that the reduced genetic diversity in the evolutionarily important genes in Yakutian cattle could be associated with gradual adaptation to the challenging subarctic environment along with human movements from the southern Siberian regions to more northern environment (Librado et al., 2015). Cattle (and horses) may have been introduced to the Yakutian region after the 9th century, perhaps as late as the 13th century (Kopoteva and Partanen, 2009). Compared to European taurine cattle, this is a relatively short time period in terms of intervals between cattle generations. In our study, genes related to the GO term "Developmental growth" were enriched in only Yakutian cattle. Stothard et al. (2011) suggested that genes associated with the GO term "Growth" may be related to the increase in the mass of intensively selected Black Angus (beef breed) and Holstein (dairy breed) cattle. However, Yakutian cattle have not been selected for increased body size as that would be less desirable characteristic in Yakutian conditions. Instead, we hypothesize that the enrichment of these growthrelated genes in Yakutian cattle may be a signature of adaptation to the harsh environment. The Yakutian cattle exhibit unique morphoanatomical adaptations to the subarctic climate and are characterized by their small live weights (adult cows typically weigh 350-400 kg, with heights of 110-112 cm); deep but relatively narrow chests; and short, firm legs (Kantanen et al., 2009a). The Yakutian cattle are unique remnants of the Siberian Turano-Mongolian type of taurine cattle (Kantanen et al., 2009a) and can be distinguished from the European humpless cattle by these anatomical characteristics.
We performed genome-wide selection-mapping scans for the three northern cattle breeds and found a great majority of SNPs exhibiting selection signatures in non-coding genomic regions. This finding indicates that selection occurs specifically via the regulatory elements of genomes [see also (Librado et al., 2015)]. We found that the studied breeds exhibited 'private' (breedspecific) selection signature patterns, indicating distinctiveness in their selection histories. We further investigated the proportions of genes exhibiting selection signatures among the breeds and found that only 5 genes from this set of genes were shared by the three breeds. Only 37 genes were shared by the two Finnish native cattle breeds, while 13 'selection signature' genes were shared by Western Finncattle and Yakutian cattle and 11 by Eastern Finncattle and Yakutian cattle. In addition, the breeds did not share any of the genes exhibiting the strongest selection signatures and harboring >5 nsSNPs, and the GO term enrichment analysis of this set of genes indicated that only one GO term ("GMP binding") was significantly enriched in all three breeds.
We identified several positively selected candidate genes underlying adaptation, appearance and production of Eastern Finncattle, Western Finncattle, and Yakutian cattle. For example, in Eastern Finncattle, selection signatures were detected in NRAP and IGFBP5, both of which have been previously identified as candidate genes for muscle development and meat quality in cattle (Williams et al., 2009), and in NOD2, which is a candidate gene for dairy production (Ogorevc et al., 2009). In Western Finncattle, we detected selection signatures in, e.g., candidate genes for beef production, such as COX5B and ITGB3 (Williams et al., 2009), and dairy production, such as CD14 (Ogorevc et al., 2009). In Yakutian cattle, several genes exhibiting selection signatures were candidate genes for muscle development and meat quality, such as COX7A1, THBS3, PFKM, and SOCS3 (Williams et al., 2009) but also for color pattern [ADAM17; (Gutiérrez-Gil et al., 2015)] and milk production traits [MUC1; (Ogorevc et al., 2009)]. We were particularly interested in the genomic adaptation to North Eurasian environments. Recently, (Yurchenko et al., 2018) identified several candidate genes in Russian cattle breeds, such as RETREG1 and RPL7, for adaptation to harsh environment, e.g., to cold. Our data with relatively limited number of animals sequenced here did not point toward statistically significant selection in these genes. However, we present in the Section "Results" of our paper several genes exhibiting significant selection signatures which are associated with biological processes and pathways hypothesized to be involved in cold adaptation in indigenous Siberian human populations in terms of response to temperature, blood pressure, basal metabolic rate, smooth muscle contraction and energy metabolism (Cardona et al., 2014). SLC8A1 (sodium/calcium exchanger 1), influencing the oxidative stress response, is an example of the genes with significant selection signatures in Yakutian cattle, Siberian human populations (Cardona et al., 2014) and native Yakutian horses (Librado et al., 2015). This example of selection signatures and associated genes found in the Yakutian cattle and Siberian human populations (Cardona et al., 2014) indicates convergent evolution between the mammalian populations adapted to subarctic environments. Convergent evolution between mammalian species in adaptation to harsh environments has also occurred, e.g., on the Tibetan plateau, as indicated by (Wang G.-D. et al., 2014;Wang et al., 2015;Yang et al., 2016).

CONCLUSION
We have investigated by whole-genome sequencing for the first time the genetic diversity of native cattle breeds originating from the northernmost region of cattle farming in the world. We found novel SNPs and indels and genes that have not yet been annotated. Our observations suggest that accurate reference genome assemblies are needed for genetically diverse native cattle breeds showing genetic distinctiveness, such as Yakutian cattle, in order to better understand the genetic diversity of the breeds and the effects of natural and artificial selection and adaptation. We identified a number of genes and chromosomal regions important for the adaptation and production traits of the breeds. Moreover, GO terms such as defense response, growth, sensory perception and immune response were enriched in the genes associated with selective sweeps. To improve our knowledge of the value of native breeds as genetic resources for future cattle breeding and the power of selection signature analyses, a greater number of animals of these breeds should be investigated in a wider breed diversity context.

DATA AVAILABILITY STATEMENT
The raw sequence reads (Fastq Files) for this study can be found in European Nucleotide Archive (ENA) under the accession number PRJEB28185 (please see Supplementary Table S1 for sample specific accessions).

AUTHOR CONTRIBUTIONS
JK designed the study and revised the manuscript. MW performed the bioinformatics and statistical analyses and drafted the manuscript. JK, RP, IA, and ZI collected the samples. RP, KP, IA, YM, and ZI participated in the experimental design and paper revision. All authors read and approved the final manuscript.

FUNDING
This work was supported by the Academy of Finland (the Arctic Ark-project no. 286040 in the ARKTIKO research program of the Academy of Finland), the Ministry of Agriculture and Forestry in Finland, and the Finnish Cultural Foundation. MW was partly supported by grants from the Betty Väänänen Foundation and Ella and Georg Ehrnrooth foundation, Doctoral School of University of Eastern Finland in Environmental Physics, Health and Biology.