
ORIGINAL RESEARCH article

Front. Genet., 23 October 2020
Sec. Evolutionary and Population Genetics

Y-Chromosome Genetic Analysis of Modern Polish Population

Łukasz Grochowalski1, Justyna Jarczak1,2, Maria Urbanowicz1, Marcin Słomka1,2, Maria Szargut3,4, Paulina Borówka5, Marta Sobalska-Kwapis1,2, Błażej Marciniak1,2, Andrzej Ossowski3,4, Wiesław Lorkiewicz5, Dominik Strapagiel1,2*
  • 1Biobank Lab, Department of Molecular Biophysics, Faculty of Biology and Environmental Protection, University of Lodz, Łódź, Poland
  • 2BBMRI.pl Consortium, Łódź, Poland
  • 3Department of Forensic Genetics, Pomeranian Medical University in Szczecin, Szczecin, Poland
  • 4The Polish Genetic Database of Totalitarianism Victims, Szczecin, Poland
  • 5Department of Anthropology, Faculty of Biology and Environmental Protection, University of Lodz, Łódź, Poland

The study presents a full analysis of the Y-chromosome variability of the modern male Polish population. It is the first study of the Polish population to be conducted with such a large set of data (2,705 individuals), which includes genetic information from inhabitants of all voivodeships, i.e., the first administrative level, in the country and the vast majority of its counties, i.e., the second level. In addition, the available data were divided into clusters corresponding to more natural geographic regions. Genetic analysis included estimation of FST distances, visualization with multidimensional scaling plots, and analysis of molecular variance. Y-chromosome binary haplogroups were classified and visualized with the use of interpolation maps. Results showed that the level of differentiation within the Polish population is quite low, although some differences were indicated. It was confirmed that the Polish population is characterized by a high degree of homogeneity, with only slight genetic differences being observed at the regional level. The use of regional clustering as an alternative to counties and voivodeships provided a more detailed view of the genetic structure of the population. The regional differences identified in the present study highlight the need for additional division of the population by cultural and ethnic criteria in such studies rather than just by geographical or administrative regionalization.

Introduction

The structure and variability of the modern Polish population have arisen as a result of the demographic and political changes that have formed the populations of this part of Europe. Poland was first regarded as a nation with the beginning of the Piast state (the so-called first Polish state) in the 10th century AD. The early history of the inhabitants of the land between the Oder and Bug rivers is inseparably connected with the discussion on the ethnogenesis of the Slavs. According to the autochthonous hypothesis, the Slavs developed and lived in the Oder and Vistula basins, and their roots in this area extend back to 1,200 to 1,000 years BC. In contrast, the allochthonous theory assumes that the Slavs arrived in this area between the fifth and sixth century CE from the Upper Dnieper basin, an area believed to be their cradle (Trzeciecki, 2016). This 100-year-old discussion has recently been joined by anthropologists and geneticists studying modern mtDNA and Y-chromosome polymorphisms (Malyarchuk et al., 2002, 2008; Branicki et al., 2005; Grzybowski et al., 2007; Rebala et al., 2007, 2013; Wozniak et al., 2010; Mielnik-Sikorska et al., 2013a) and recently also ancient DNA (Juras et al., 2014).

Modern Polish history, especially during the last 200 years, was rich in dramatic events such as wars, occupations, shifting borders, and political migrations. However, the consequences of World War II (WWII) had the greatest influence on the shaping of the modern demographic situation. Until that time, the population of Poland was an ethnic, religious, and linguistic mosaic in which people had coexisted for centuries [native Poles made up 65.5% of the population in 1939 (Polish Ministry of Information, 1941)]. The final number of victims during WWII was estimated at more than 6 million Polish citizens (Polish War Reparations Bureau, 1947), which was more than 17% of the prewar population of Poland (Polish Ministry of Information, 1941). Because of the hostilities, young men constituted a large part of this number, and their deaths resulted in a significant depletion of the gene pool (Diepenbroek et al., 2019).

Furthermore, the borders of Poland were radically shifted, which triggered significant demographic changes such as mass resettlements and migrations. Millions of people of different ethnicities were suddenly forced to leave their long-established homes in mass migrations (Eberhardt, 2000). In 1944–1948, around 800,000 Polish people were officially resettled from the Ukrainian SSR, from lands that had belonged to Poland before WWII and were incorporated into the Soviet republics (Kersten, 1974; Czerniakiewicz, 1987); this was as much as 96% of the people registered there for transfer (Piesowicz, 1988). These official migrants were resettled to the area between Upper and Lower Silesia (Hryciuk et al., 2008). From the Byelorussian SSR, around 300,000 Polish people were resettled (33.5% of those registered for transfer) (Kersten, 1974; Czerniakiewicz, 1987; Piesowicz, 1988) to Lower Silesia, the western part of Greater Poland, Lubusz, Szczecin in West Pomerania, and Gdańsk in Pomerania (Hryciuk et al., 2008). From the Lithuanian SSR, around 200,000 Polish people were resettled (51.5% of those registered for transfer) (Kersten, 1974; Czerniakiewicz, 1987) to Warmian–Mazurian, Pomerania, and, in some cases, Lower Silesia (Hryciuk et al., 2008). Moreover, around 250,000 Polish people were also officially resettled from the Soviet Union (Kersten, 1974) (Supplementary Figure S1). About 3 million people also moved there from the rest of Polish territory, compared with almost 1.2 million native Poles who already lived in Upper Silesia and Warmian–Mazurian as the indigenous population (Kosiński, 1960; Eberhardt, 2000). At the same time, almost 2 million Polish people returned to Poland from Western Europe (Kersten, 1974) (Supplementary Figure S2). In 1955–1959, the next wave of resettlements took place, and 250,000 native Poles were displaced from the Soviet republics to the new western Polish lands (Latuch, 1994) (Supplementary Figure S1).
Other ethnic populations were displaced in the same way: several million Germans moved from the new Polish lands to Germany, and the majority of around 700,000 indigenous Ruthenians and Ukrainians from Subcarpathian were resettled to the Ukrainian SSR, while 140,000 were forcibly moved to Lower Silesia, West Pomerania, and Warmian–Mazurian in Operation “Wisła” (Eberhardt, 2000) (Supplementary Figure S2).

In summary, within the past 80 years, more than 11 million people of both Polish and non-Polish descent were moved either to or from Poland (Ploski et al., 2002). The genetic structure of the country changed dramatically between the prewar and postwar periods (Rebala et al., 2013; Diepenbroek et al., 2019).

Modern population studies are often based on genome-wide analyses, most commonly employing single-nucleotide polymorphism (SNP) microarray technology; this approach is capable of identifying disease-related or trait-related variants and is essential for the advancement of personalized or forensic medicine (Tam et al., 2019). However, analysis of SNPs located on an allosome can also be of great value in anthropological and forensic research, as they carry key information about the genetic diversity of a population. Knowledge of the phylogenies of the paternally inherited non-recombining region of chromosome Y (NRY) can be acquired by examining the patterns of Y-short tandem repeats (Y-STR); these are subject to a higher mutation rate and thus demonstrate higher typing resolution than the more slowly evolving Y-chromosomal biallelic polymorphisms (Rosser et al., 2000; Gill et al., 2001).

Previous studies tracing paternal lineages and kinship have analyzed Y-STR haplotype and allele frequencies of Polish men in different parts of the country (Pepinski et al., 2004b; Rebala and Szczerkowska, 2004; Wozniak et al., 2007; Soltyszewski et al., 2008; Wolanska-Nowak et al., 2009), among representatives of selected cities (Ploski et al., 2002; Kayser et al., 2005; Rebala and Szczerkowska, 2005), among ethnic groups (Rebala et al., 2007, 2013; Janica et al., 2008), and among minorities and residents (Pepinski et al., 2004c, 2005a,b; Janica et al., 2006). These studies have typically employed residual polymerase chain reaction (PCR)–based Y-chromosomal biallelic polymorphism estimation (Rosser et al., 2000), autosomal (Behar et al., 2013), and whole-genome approaches (Lao et al., 2008).

Our study presents a full analysis of the Y-chromosome variability of the modern male Polish population. It is the first study of the Polish population to be conducted with such a large set of data (2,705 individuals), which includes genetic information from inhabitants of all voivodeships, i.e., the first administrative level, in the country and the vast majority of its counties, i.e., the second level. In addition, the available data were divided into clusters corresponding to more natural geographic regions. The obtained results, as yet unpublished, estimate the missing genetic variability of the modern Polish population and examine the genetic relationships between its members, allowing researchers to shed light on the historical, demographic, and social changes that have occurred during the turbulent history of the country. They represent an excellent complement to earlier mtDNA studies on the diversity of the Polish population (Jarczak et al., 2019).

Materials and Methods

Subjects

Adult participants were recruited between 2010 and 2012 under the TESTOPLEK project from the general Polish population, forming the POPULOUS collection of 10,000 saliva samples derived from female and male attendees and complemented by individual in-depth interviews based on questionnaires. The questionnaires recorded the place of residence, together with various other questions about the origin or ancestry of parents and grandparents. Saliva samples were collected up to 2016 and were included in the POPULOUS collection at the Biobank Lab of the Department of Molecular Biophysics of the University of Lodz (Strapagiel et al., 2016; Dobrowolska et al., 2019), which is currently registered in the Directory (v. 4.0) of the BBMRI-ERIC consortium under the registration number bbmri-eric:ID:PL_BLUL:collection:POPULOUS_BLUL. Approval for this study was obtained from the University of Lodz Ethics Review Board. All procedures were performed in accordance with the Declaration of Helsinki (ethical principles for medical research involving human subjects).

Finally, a group comprising 2,705 adult male inhabitants of all 16 Polish voivodeships was assembled for the present study. These participants were found to represent 337 of 380 counties (in Polish: powiaty). The regional data were assembled into 40 clusters, thus providing a high-resolution overview of the diversity of modern-day male Polish population (Supplementary Figure S3).

Clustering and Visualization

Cluster formation allowed data from counties with low sample sizes to be merged, to provide a greater density of points than analysis based on voivodeships alone. The data from the counties were merged into 40 clusters using the K-means method (Jarczak et al., 2019).

Clustering was carried out using Python (v.3.7.4) with the Scikit-learn package (Pedregosa et al., 2011). The approach resulted in the formation of a number of regions, of which the smallest had a cluster size of 30 and the largest a cluster size of 301. The list of counties and their resulting clusters can be found in Supplementary Table S1.
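The merging of counties into clusters can be illustrated with a minimal sketch. It is a plain-Python version of Lloyd's K-means algorithm, used here as a stand-in for the Scikit-learn implementation named above. The input features (county coordinates), the example points, and the choice of k = 2 are assumptions made for illustration only; the text states only that K-means was used to form 40 clusters.

```python
import random

def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm for 2-D points; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct input points
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k),
                key=lambda c: (x - centroids[c][0]) ** 2 + (y - centroids[c][1]) ** 2)
            for x, y in points
        ]
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return labels, centroids

# Hypothetical county coordinates (longitude, latitude), not study data.
county_coords = [
    (19.4, 51.7), (19.5, 51.8), (19.3, 51.6),   # three nearby counties
    (14.5, 53.4), (14.6, 53.5), (14.7, 53.3),   # three counties far to the northwest
]
labels, centroids = kmeans(county_coords, k=2)
print(labels)
```

In this toy input the two geographic groups are well separated, so each group ends up sharing one cluster label.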

The geographical representation of the haplogroup frequencies was prepared using QGIS (v.2.18.16). Surface interpolation was carried out using the Inverse Distance Weighted method on a valid administrative map of Poland downloaded from the Geodesic and Cartographic Documentation Center website. The longitude and latitude of the counties were obtained with the Google Maps API.
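The idea behind Inverse Distance Weighted interpolation can be shown with a short sketch. It is not the QGIS implementation: the function name, the planar distance in coordinate units, and the exponent of 2 are assumptions, since the text does not state the settings used.

```python
import math

def idw(x, y, samples, power=2.0):
    """Inverse Distance Weighted estimate at (x, y).

    samples: iterable of (sx, sy, value); power is the distance exponent
    (assumed to be 2 here).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return value  # query point coincides with a sample point
        w = 1.0 / d ** power
        weighted_sum += w * value
        weight_total += w
    return weighted_sum / weight_total

# Hypothetical haplogroup frequencies at two locations (not study data).
samples = [(0.0, 0.0, 0.8), (2.0, 0.0, 0.6)]
print(idw(1.0, 0.0, samples))  # midway between the two points
```

A point midway between two samples receives their average, while points closer to one sample are pulled toward that sample's value.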

Sampling and Genotyping

Saliva was collected from each individual using Oragene OG-500 DNA storage probes. Genomic DNA was manually extracted with PrepitL2P® (PD-PR-052, DNA Genotek, Canada), and the samples were genotyped using Infinium HTS Human Core Exome PLUS microarrays (Illumina, Inc., San Diego, CA, United States), according to the manufacturer’s protocol. Quality control of the obtained results was performed by examining raw fluorescence intensities in GenomeStudio (v.2011.1) with the Genotyping Module (v.1.9.4) (Illumina, Inc.); all samples met the criteria, demonstrating a call rate greater than 0.98 with the 10% GenCall parameter above 0.4. A total of 1,755 SNPs (Supplementary Table S2) located on the Y chromosome passed QC and were included in the analysis. StrandScript (Wang et al., 2017) was used to correct strand orientation. The full genotyping data set can be found at the European Genome-phenome Archive; the accession number for the Y chromosome microarray data of the Polish population reported in this article is EGAS00001004111.

Bioinformatics Analysis

Genetic variation between, and within, voivodeships and clusters was quantified by analysis of molecular variance (AMOVA) using Arlequin (v.3.5) (Excoffier and Lischer, 2010). Arlequin was also used to calculate pairwise genetic distance (FST) for clusters and voivodeships based on the obtained Y-SNP data (n = 1,755 SNPs). The statistical significance of the Arlequin analysis was assessed using 10,000 permutations. The pairwise genetic distances were visualized by multidimensional scaling (MDS) analysis using the cmdscale function in R (v.3.4.2).
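To make the pairwise distance and permutation test more concrete, the sketch below computes a simplified FST between two groups of haplotypes and a permutation p-value. It uses a Hudson-style estimator (one minus the ratio of mean within-group to mean between-group pairwise differences), which is an assumption chosen for brevity and is not claimed to be the estimator implemented in Arlequin. The haplotype strings are invented, and the default of 1,000 permutations is lower than the 10,000 used in the study.

```python
import random
from itertools import combinations

def n_diff(a, b):
    """Number of SNP positions at which two haplotypes differ."""
    return sum(x != y for x, y in zip(a, b))

def mean_within(pop):
    """Mean pairwise difference inside one group (needs >= 2 haplotypes)."""
    pairs = list(combinations(pop, 2))
    return sum(n_diff(a, b) for a, b in pairs) / len(pairs)

def mean_between(pop1, pop2):
    """Mean pairwise difference between haplotypes of two groups."""
    return sum(n_diff(a, b) for a in pop1 for b in pop2) / (len(pop1) * len(pop2))

def pairwise_fst(pop1, pop2):
    """Hudson-style estimate: 1 - pi_within / pi_between."""
    within = (mean_within(pop1) + mean_within(pop2)) / 2
    between = mean_between(pop1, pop2)
    return 0.0 if between == 0 else 1 - within / between

def permutation_p(pop1, pop2, n_perm=1000, seed=0):
    """Share of random regroupings with FST at least as large as observed."""
    rng = random.Random(seed)
    observed = pairwise_fst(pop1, pop2)
    pooled = list(pop1) + list(pop2)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if pairwise_fst(pooled[:len(pop1)], pooled[len(pop1):]) >= observed:
            hits += 1
    return hits / n_perm

# Invented Y-SNP haplotypes coded as 0/1 strings (not study data).
pop_a = ["0000", "0000", "0001"]
pop_b = ["1110", "1111", "1111"]
print(pairwise_fst(pop_a, pop_b), permutation_p(pop_a, pop_b))
```

With this estimator, groups that are no more different from each other than their members are among themselves can yield values at or below zero, which is one way negative pairwise estimates can arise.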

yHaplo (v.1.0.19) (Poznik, 2016) performed Y-SNP binary haplogroup assignments on 496 informative SNPs. Haplogroup frequencies were calculated for voivodeships and clusters. Links to all web resources mentioned in the text are listed in Appendix A.
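Once each sample has a haplogroup call, per-region frequencies reduce to counting. The following sketch shows that step under stated assumptions: the record layout of (region, haplogroup) pairs and the example rows are hypothetical and do not reflect the yHaplo output format.

```python
from collections import Counter, defaultdict

def haplogroup_frequencies(records):
    """Return {region: {haplogroup: percent}} from (region, haplogroup) pairs."""
    counts = defaultdict(Counter)
    for region, haplogroup in records:
        counts[region][haplogroup] += 1
    return {
        region: {hg: 100.0 * n / sum(c.values()) for hg, n in c.items()}
        for region, c in counts.items()
    }

# Hypothetical sample assignments (not study data).
records = [
    ("Region A", "R"), ("Region A", "R"), ("Region A", "R"), ("Region A", "I"),
    ("Region B", "R"), ("Region B", "N"),
]
freqs = haplogroup_frequencies(records)
print(freqs["Region A"])
```

The same function can be applied to voivodeship labels or to cluster labels, since only the grouping key changes.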

Results

A total of 2,705 unrelated males from the Polish population with a recorded place of residence were included in the study. The haplogroup typed for each sample is listed in Supplementary Table S3. The analysis of allele distribution among the studied samples revealed 12 different haplogroups, of which R was divided into subhaplogroups R1a and R1b for better resolution (Table 1).

Table 1. Main haplogroups and selected subhaplogroups frequencies for Polish population including division into voivodeships (n = 2,705).

The most frequent Y-SNP binary haplogroups in all analyzed samples were found to be R (71.02%), I (15.71%), N (4.29%), E (3.84%), J (3.22%), and G (1.22%). The others, viz. Q, C, T, H, and O, together contributed less than 1% (0.70%), each being represented by only individual samples (Table 1).

The samples were divided to visualize the distribution of haplogroups according to voivodeship. Most were characterized by the presence of six or seven haplogroups (hgs), with only Silesia (10 hgs) and Lublin (9 hgs) being more diverse. While in Silesia this high number may be attributed to the higher number of samples recorded, Lublin, with one less haplogroup identified, recorded a similar number of samples to the other voivodeships. Additionally, most of the voivodeships did not differ with regard to the number of haplogroups, which suggests the population is highly homogeneous (Table 1).

In all voivodeships, hg R was the most common, with the highest frequency observed in the Lodz voivodeship (86.72%) and lowest in Lower Silesia (62.34%) (Table 1). Interestingly, Lodz is represented almost only by haplogroups R and I, accounting for 93.80% of the samples.

A deeper investigation of haplogroup distribution was carried out based on the clusters. Haplogroup R is unevenly distributed in the Polish population, with the central part of the country marked by the highest frequencies (Figure 1). When hg R is divided into subhaplogroups, R1a is seen to be distributed mostly in the central part of Poland, with a few regions in the west and east of the country. R1b is more widely distributed across the territory of Poland, reaching farther east and west (Figure 1).

Figure 1. Interpolation maps for the two main haplogroups (R and I) with the division (in case of hg R) into subhaplogroups R1a and R1b observed in the Polish population.

The interpolation map of haplogroup I shows that it is more evenly represented in the Polish population, although some trends are indicated. The highest frequencies are observed in western Poland and in some regions of eastern Poland, mostly in the Podlaskie and Lublin voivodeships but also reaching eastern parts of Mazovia, western parts of Warmian–Mazurian, and almost all of Subcarpathian (Figure 1). Haplogroup N is observed mostly throughout the Podlaskie voivodeship. In the case of haplogroups E and J, the differences are less pronounced, and a much greater diversity of frequencies is observed (Figure 2).

Figure 2. Interpolation maps for the other main haplogroups (N, E and J) observed in the Polish population.

The maps in Figures 1, 2 present an interpolated distribution of the seven most frequent haplogroups in the Polish population.

Genetic Differences (FST)

To identify changes in genetic distance across the population, voivodeships and clusters were compared by the FST metric, which ranged from 0.0001 to 0.09123, depending on the tested voivodeship (Supplementary Table S4). The highest FST values were observed between Lodz and Lower Silesia (FST = 0.09123; p < 0.00001), as well as between Lodz and Podlaskie (FST = 0.085; p < 0.00001) (Supplementary Table S4 and Supplementary Figure S4). The results identified Lodz as an outlier, being significantly different to the 14 other voivodeships. Lower Silesia demonstrated the second highest number of statistically significant FST values. Only the Lodz and the Kuyavian–Pomeranian voivodeship pair demonstrated no differences.

Furthermore, an MDS plot, constructed on the basis of pairwise FST values, clearly shows that most voivodeships form a compact group and that the Lodz, Lublin, Kuyavian–Pomeranian, and Holy Cross voivodeships lie outside them (Figure 3).

Figure 3. Two-dimensional MDS plot of Polish voivodeship populations based on pairwise FST values.

The pairwise FST analysis performed for clusters returned values ranging from −0.018 to 0.192 (Supplementary Table S5). The highest FST estimates were identified between clusters 20 (Lower Silesia, area of Jelenia Góra and Zgorzelec) and 30 (Warmian–Mazurian, area of Giżycko, Ełk, and Gołdap) (FST = 0.10778; p = 0.01562); between clusters 20 and 32 (Greater Poland: Konin, Kalisz, and Sieradz counties) (FST = 0.10776; p = 0.00098); and between clusters 20 and 28 (a cluster on the border of Silesia, Lodz, and Opole) (FST = 0.10692; p = 0.00488) (Supplementary Figure S5 and Supplementary Table S5). Interestingly, cluster 12 (Subcarpathian region including Przemyśl, Sanok, and the Bieszczady mountains) demonstrated the same relations as cluster 20 with clusters 30, 28, and 32 (FST = 0.09196, 0.09144, and 0.09085, with p = 0.01074, 0.00781, and 0.00293, respectively). In addition, clusters 20 and 12 did not demonstrate significant differences in the number of estimates, despite being located on opposite sides of the country: 20 is in the southwest of Poland, close to the border with Germany, whereas 12 is found in the southeast, close to the border with Ukraine. Additionally, the highest number of statistically significant pairwise FST estimates was observed in clusters 20 (18 estimates) and 32 (17 estimates) (Supplementary Table S5).

Another MDS plot was constructed to visualize the relationships between generated clusters (Figure 4). In this case, a large group was formed including almost all clusters apart from the following: 12 (Bieszczady region), 14 (region of Słupsk), 20 (region of Jelenia Góra, Bolesławiec, and Zgorzelec), 28 (region of Wieluń, Częstochowa, and Lubliniec), 30 (Mazury region), 32 (region of Konin, Kalisz, and Ostrów Wielkopolski), and 35 (region of Włocławek and Kutno) (Figure 4).

Figure 4. Two-dimensional MDS plot of cluster populations based on pairwise FST values.

Analysis of Molecular Variance

The analysis of molecular variance found that, for voivodeships, 99.25% of the variation was within populations and 0.75% among populations. Similar results were observed for the clusters: 98.73% of the variation was within populations and 1.27% among populations. The Fixation Index was found to be 0.00746 for the voivodeships and 0.01269 for the clusters, with p = 0.00426 and p = 0.01119, respectively (Table 2).

Table 2. Analysis of molecular variance (AMOVA) accounting for all voivodeships and clusters.
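The link between the variance percentages and the Fixation Index reported above can be checked with a small calculation. The sketch assumes the standard two-level AMOVA relation, in which the index is the among-population share of total variance; the function name is illustrative, and the inputs are the rounded percentages from the text, so the results match the reported values only to rounding.

```python
def fixation_index(var_among, var_within):
    """Among-population share of total variance (two-level AMOVA)."""
    return var_among / (var_among + var_within)

# Rounded percentages of variation quoted in the text.
fst_voivodeships = fixation_index(0.75, 99.25)
fst_clusters = fixation_index(1.27, 98.73)
print(round(fst_voivodeships, 4), round(fst_clusters, 4))
```

The values obtained this way, about 0.0075 and 0.0127, agree with the reported 0.00746 and 0.01269 once the rounding of the percentages is taken into account.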

Discussion

The genetic variability of the Y chromosome across the Polish population has been analyzed over the years in studies of different regions of Poland (Pepinski et al., 2001; Janica et al., 2005; Rebala and Szczerkowska, 2005; Soltyszewski et al., 2007; Wozniak et al., 2007; Wolanska-Nowak et al., 2009; Kostrzewa et al., 2013), of the Lithuanian, Byelorussian, and Tatar minorities living in Poland (Pepinski et al., 2004c, 2005a; Janica et al., 2005), and of larger population groups, including the entire population of the country (Lessig et al., 2001; Ploski et al., 2002; Kayser et al., 2005; Lessig et al., 2008; Soltyszewski et al., 2008; Rebala et al., 2013). Most of these studies were based on the PCR analysis of STRs. In contrast, the present study was performed using a microarray approach, which allowed the identification of SNPs on the Y chromosome; this approach yielded a detailed description of the genetic structure of the male population in Poland according to its voivodeships, counties, and clusters of counties.

Haplogroup prediction was performed based on 496 SNP markers included in the Infinium HTS Human Core Exome microarray. Because the panel does not allow for differentiation of all possible haplogroups within the Eurasian metapopulation, only main haplogroups were considered for calculation of frequencies within specific voivodeships.

Interpopulation Variability of Y Chromosome

For the interpopulation analysis (including haplogroup frequencies from Slovakia, Slovenia, Czechia, Ukraine, Russia, Lithuania, Latvia, and Germany), we used our results at a level of haplogroup resolution that allows comparison with each country. The approach of choosing different levels of haplogroup resolution for different types of analysis was also successfully applied by Altena et al. (2020).

Our results proved highly consistent with those obtained by Kayser et al. (2005) for a group of 913 Polish males. The frequency of R1a1 was almost identical in both studies [57% in Kayser et al. (2005) and 56.93% in our sample]. Similarly, the frequencies of haplogroups I and R1b were comparable in both datasets (17.3 vs. 15.71% and 11.6 vs. 14.09% for hgs I and R1b, respectively). Because the microarray used in the present study lacks markers for hgs E3b (M35) and N3 (M46), we were not able to calculate exact frequencies of those hgs. Both are, however, subhaplogroups of hgs included in our results. It can be assumed that at least some part of hg E (3.84%) belongs to either E3b (M35) [4.5% (Kayser et al., 2005)] or DE (xE3b) (YAP) [0.5% (Kayser et al., 2005)], whereas the frequency of hg N (4.29%) is most probably a sum of N3 (M46) [3.7% (Kayser et al., 2005)] and K (xN3, P) (M9) [0.5% (Kayser et al., 2005)]. The concordance also applies to haplogroups with lower frequencies in the Polish population: J2 (M172) [2.5% (Kayser et al., 2005)] was predicted for 2.37% of samples, F (xI, J2, K) (M89) [2.0% (Kayser et al., 2005)] for 2.11% of the population, and P (xR1a) (M74) [0.3% (Kayser et al., 2005)] for 0.26% of the population.

As an insight into the most recent Polish population, we performed a haplogroup prediction based on 496 27-Y-STR haplotypes published by Spolnicka et al. (2017). A high level of similarity between the two datasets is visible; however, the lack of a prediction for 140 samples (>25% of the studied sample set) seems to be the main reason for the inconsistencies found. One of those is the overrepresentation of haplogroup R1a [56.93 vs. 68.6% in the haplogroup prediction based on Spolnicka et al. (2017)], and the other is the underrepresentation of hg I [15.71 vs. 6.8% in the haplogroup prediction based on Spolnicka et al. (2017)]. The frequencies of some of the remaining predicted haplogroups (R1b, N, G, Q) are consistent with our findings. This bias clearly shows the necessity of using biallelic markers for Y-chromosomal haplogroup determination.

Although part of both Central and Eastern Europe and the Baltic Rim countries, Poland does differ from its neighboring countries in Y-chromosomal haplogroup structure, at least to some degree (for details, see Supplementary Table S6, which includes all national frequency data discussed below). The results of the present study are similar to the haplogroup frequencies of Slovenia (Zupan et al., 2013), a South Slavic country, and of two countries considered Western Slavic (Wozniak et al., 2010): Czechia (Zastera et al., 2010) and Slovakia (Petrejcikova et al., 2010). The populations of those countries are considered homogenous (Rebala et al., 2007). This is especially the case for Poland and Czechia, as confirmed by the PCA of autosomal biallelic markers studied by Lao et al. (2008). In our case, the main difference between Slovenia, Czechia, Slovakia, and Poland lay in the frequency of hg R1a, found in almost 57% of Polish males but only between 36.9% (Slovenia) and 38% (Slovakia) for the aforementioned nations. Both Slovenia and Czechia are also characterized by a much higher level of hg R1b (20.3 and 24.8%, respectively), whereas for Slovakia the level of R1b seems similar to that of Poland (13.2 vs. 14.09%, respectively). Both Slovenians and Slovaks often fall within hg I (28.3 and 27.2%, respectively). Hg I is also frequently found in Czechia (20.1%), whereas in our results for Poland its frequency is 15.7%. Hgs with lower frequencies (J, G, E, and N), which together account for 12.57% of the Polish population, are also found in all three of the aforementioned countries, the only exception being haplogroup N, which is not present in the Slovenian population. Those haplogroups sum to 12.2, 17.2, and 17.4% of the Slovenian, Slovak, and Czech populations, respectively.

The populations of Lithuania (Kasperaviciute et al., 2004) and Latvia (Pliss et al., 2015) seem genetically more distant from Poland, despite the Polish-Lithuanian Union that lasted for more than 400 years between the 14th and 18th centuries (Ploski et al., 2002). In both of those countries, hg N, present in only 4.29% of the Polish population, is one of the two most commonly found haplogroups (36.7 and 41.5%, respectively), the other being R1a (44.9 and 37.8%, respectively). R1a is the most common haplogroup in Poland, found in almost 57% of the population. The Germanic R1b haplogroup is, understandably, found at a much lower level in Latvia and Lithuania than in Poland (Wozniak et al., 2010). For Lithuania, its frequency is estimated to be below 5.1% [as Kasperaviciute et al. (2004) did not differentiate between R1b and Q, this is the sum of both], and for Latvia it is 7.6%, which is almost three and two times lower than in Poland, respectively.

As Maliarczuk and Derenko (2008) investigated haplogroup frequencies across the European part of Russia, some conclusions can be drawn regarding their similarities to and differences from the population of Poland, also in comparison with Ukraine, which lies in between (Mielnik-Sikorska et al., 2013b). For both Russia and Ukraine, hg R1a is still common [Northern Russia (NR), 34.2%; Central Russia (CR), 46.54%; South Russia (SR), 55.4%; Ukraine, 43.9%]; however, in NR, hg N is the most frequent (43% of the population). For CR and SR, the frequency of haplogroup N is lower (17.2 and 10%, respectively), yet much higher than for Poland (4.29%). Haplogroup N was not found by Mielnik-Sikorska et al. (2013b) within the Ukrainian population. Similarly to Lithuania and Latvia, both Russia and Ukraine have a much lower frequency of the R1b subhaplogroup than Poland (Ukraine and NR, 5.4%; CR, 7.1%; SR, 4.8%). Haplogroup I is found with a high frequency in Ukraine and SR (28.4 and 21%, respectively) and in CR and NR (17.5 and 13.1%, respectively), unlike in Poland, where we calculated that it can be found in greater than 6% of the population. In all of the aforementioned countries, haplogroup J is found in less than 5% of the population (Ukraine, 3.4%; NR, 1.8%; CR, 4.0%; SR, 3.5%), much like in Poland (3.22%). Furthermore, it is the J2 subhaplogroup that is found more frequently, including in Ukraine, where J2 is found almost exclusively.

As expected, of all the neighboring countries, Germany is the most distant from Poland in Y-haplogroup distribution. As observed by Kayser et al. (2005), the frequency of R1b is almost three times higher in Germany than in Poland (38.9 vs. 14.09%) and the frequency of I almost four times higher (23.6 vs. 6.02%), whereas R1a is found almost three times less frequently in Germany than in Poland (17.9 vs. 56.93%, respectively).

Intrapopulation Variability of Y Chromosome

Y-chromosome polymorphism analysis and both Y-SNP and Y-STR typing indicate that the Polish population is highly homogeneous both in terms of the entire country (Ploski et al., 2002) and separate regions (Pepinski et al., 2004a; Soltyszewski et al., 2007; Wozniak et al., 2007; Wolanska-Nowak et al., 2009). While the present study generally confirmed this result, it also allowed a more detailed insight into the diversity of the Polish population at the level of administrative units and clustered regions: the genetic information was related to place of residence, with participants from all voivodeships and the majority of counties; further testing was also facilitated by the use of clustering as an additional method of population grouping. A goal of the study was to see whether a different result could be obtained by using a large set of data, that is, by examining a well-established representation of the entire Polish population together with regional clustering. Our findings indicate homogeneity, with most variation occurring within populations at the voivodeship and cluster level: 99.25% for voivodeships and 98.73% for clusters. Only a small proportion of the total variance was attributed to variation among groups in voivodeships (0.75%) and clusters (1.27%). This observation is consistent with Kayser et al. (2005), who reported 0.3% variability computed for Y chromosome SNPs.

The observed differences between the studies can be accounted for by differences in sample population number and profile. The present study was based on a data set comprising 2,705 individuals from all 16 voivodeships and 337 of the 380 counties, whereas the results of Kayser et al. (2005) were probably based on inhabitants of the selected cities in Poland (Wrocław, Warsaw, Lublin, Kraków, Bydgoszcz, Gdańsk, Szczecin, and Suwałki). Unfortunately, because of a lack of such studies, it is not possible to perform a detailed comparison of haplogroup frequencies for all voivodeships and counties.

Regarding the numbers of different haplogroups in voivodeships, the present findings correspond with the variability of mtDNA in the Polish population (Jarczak et al., 2019). In the earlier study, the Silesia voivodeship was indicated as the region with the greatest number of mtDNA haplogroups (19 of 21). A similar situation is observed in the present study: 10 of 11 total Y-chromosome haplogroups were found in individuals from Silesia. In contrast, Holy Cross voivodeship demonstrated the least variety, with only 10 mtDNA haplogroups. The differences shown in the present study are less pronounced, with most voivodeships being characterized by six or seven haplogroups. The distribution and the frequency of haplogroups indicate that the Polish population is characterized by greater diversity in the case of mtDNA (Jarczak et al., 2019); several haplogroups were found to be present in the Polish population, with hg H demonstrating the highest frequency. Furthermore, four hgs (H, U, J, T) accounted for 82.38% of the studied population; however, many others prevalent in the European population (K, W, I, HV, V) were also observed. The Y-chromosome SNP analysis found hg R to be present in more than 71% of Polish males; together with hg I, it accounts for the vast majority of Y-chromosome haplogroups (86.73%).
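The quantities compared above (number of distinct haplogroups per voivodeship, frequency of each haplogroup, and the combined share of selected haplogroups) are simple tallies. A minimal sketch follows; the function names and the records are hypothetical and serve only to show the arithmetic, not the study data.

```python
from collections import Counter


def haplogroup_shares(calls):
    """Percentage share of each haplogroup in a list of haplogroup calls."""
    counts = Counter(calls)
    n = len(calls)
    return {hg: 100 * c / n for hg, c in counts.most_common()}


def combined_share(shares, haplogroups):
    """Summed percentage of the selected haplogroups."""
    return sum(shares.get(hg, 0.0) for hg in haplogroups)


def distinct_per_region(records):
    """Number of distinct haplogroups per region.

    records: iterable of (region, haplogroup) pairs.
    """
    seen = {}
    for region, hg in records:
        seen.setdefault(region, set()).add(hg)
    return {region: len(hgs) for region, hgs in seen.items()}


# Hypothetical records: (region, major haplogroup), for illustration only.
records = [
    ("region_a", "R"), ("region_a", "R"), ("region_a", "I"),
    ("region_a", "N"), ("region_b", "R"), ("region_b", "E"),
]
shares = haplogroup_shares([hg for _, hg in records])
print(shares)
print(combined_share(shares, ["R", "I"]))
print(distinct_per_region(records))
```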

In contrast to previous studies, the present study examined a larger number of samples taken from individuals from all administrative regions of Poland and applied clustering as an additional method of grouping the populations. Nevertheless, slight differences were observed between some studied regions, depending on the method of analysis. The Lodz voivodeship, for example, was found to be distinct from other voivodeships with regard to mtDNA variability (Jarczak et al., 2019). The historical basis for this variation is unclear: in contrast to West Pomerania and Warmia–Mazuria, Łódź, as a native voivodeship (excluding its western part, see below), has not been the site of large-scale migration. Furthermore, MDS visualization indicated that almost all clusters were grouped together, indicating population homogeneity; however, clusters 12 (Bieszczady region), 14 (Słupsk region), 20 (Jelenia Góra, Bolesławiec, and Zgorzelec region), 28 (Wieluń, Częstochowa, and Lubliniec region), 30 (Mazury region), 32 (Konin, Kalisz, and Ostrów Wielkopolski region), and 35 (Włocławek and Kutno region) were distinct from this grouping, suggesting that genetic differences exist between their inhabitants.
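How an MDS plot separates a distinct cluster from a homogeneous group can be sketched with classical (Torgerson) MDS applied to a matrix of pairwise genetic distances. This is an illustrative assumption: the MDS variant and software used for the published plots may differ, and the cluster labels and distance values below are hypothetical. The sketch uses only the Python standard library and finds the leading eigenvectors by power iteration.

```python
import math
import random


def classical_mds(dist, dims=2, iters=1000, seed=0):
    """Embed items in `dims` dimensions from a symmetric distance matrix.

    Classical MDS: double-center the squared distances, then use the
    leading eigenvectors (found by power iteration with deflation).
    Limitation: axes are extracted only while the dominant remaining
    eigenvalue is positive.
    """
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    # Double centering: B = -1/2 * J * D^2 * J
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]

    rng = random.Random(seed)
    coords = [[0.0] * dims for _ in range(n)]
    for d in range(dims):
        v = [rng.random() - 0.5 for _ in range(n)]
        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # nothing left to extract
                break
            v = [x / norm for x in w]
        bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        eigval = sum(v[i] * bv[i] for i in range(n))
        if eigval <= 1e-12:
            break  # remaining axes carry no positive variance
        scale = math.sqrt(eigval)
        for i in range(n):
            coords[i][d] = scale * v[i]
        # Deflate so the next pass finds the next eigenvector.
        for i in range(n):
            for j in range(n):
                b[i][j] -= eigval * v[i] * v[j]
    return coords


# Hypothetical pairwise distances: three similar clusters and one outlier.
labels = ["cluster_a", "cluster_b", "cluster_c", "cluster_d"]
distances = [
    [0.000, 0.001, 0.002, 0.020],
    [0.001, 0.000, 0.001, 0.019],
    [0.002, 0.001, 0.000, 0.021],
    [0.020, 0.019, 0.021, 0.000],
]
for label, point in zip(labels, classical_mds(distances)):
    print(label, point)
```

In such an embedding, the outlying cluster lies far from the others along the first axis, which is the pattern described for the distinct clusters above.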

The Bieszczady region, for example, is located in the southeastern part of Poland and is considered geographically distant from the rest of the country. It is characterized by one of the highest levels of forest cover in Poland and a lack of large urban centers. Furthermore, the region was historically affected by mass displacement of Lemkos and Ukrainians, with about 700,000 people having been displaced from the former Rzeszów voivodeship, particularly the counties of Lesko, Przemyśl, and Sanok: the Ukrainian people were moved to the east, whereas the Lemkos mainly settled in Lower Silesia and Masuria, which were granted to Poland after WWII. The Bieszczady region itself was resettled from the late 1950s (Ociepka, 2001).

Cluster 30, which corresponds to the Mazury region, has a different history from Bieszczady but was also a site of mass resettlement. Before WWII, the region was part of German East Prussia; however, from 1946 to the 1970s, the Masurian inhabitants migrated to Germany and were replaced by people from other regions of Poland, such as those resettled from the Bieszczady region.

In the case of clusters 20, 28, 30, and 32, however, the historical explanation for their separation based on demographic processes is unclear. There are some possible historical explanations, such as the complete removal of at least 250,000 native Polish citizens from the Reich District of the Warta river land (Ger. Reichsgau Wartheland) and their replacement by German citizens, mostly from the Baltic region (Eberhardt, 2000). The Warta river land covered a vast area from Poznań in the west, through the Kalisz region to Lodz in the east, and reaching as far as Inowrocław in the north, which more or less corresponds to the areas covered by cluster no. 32.

Interestingly, while previous analyses based on mtDNA variability (Jarczak et al., 2019) generally identify different regions as being genetically distinct, some similarities between the studies are visible. The region of Western Kuyavia (cluster no. 47 in the cited study) seems to be comparable to cluster 32, at least in some counties, in that it was also found to be genetically distinct. In addition, the previous study based on mtDNA variation indicated the Mazuria region (cluster no. 49 in the cited study) to be genetically distinct, and the present study found the same for its analogous cluster (no. 30). However, it is not possible to make a full and accurate comparison between the two studies because they used different numbers of clusters.

Interpolation maps were used to visualize regional differences between observed frequencies of hgs in Poland. As shown in Figure 1, haplogroup R1a is distributed mostly in the central part of Poland, with a few regions in the west and east of the country. Interestingly, R1a was also found to be present in high numbers in eastern regions, including the Podlaskie and Warmian–Mazurian voivodeships, as well as almost all of the Lublin voivodeship; similar results were also obtained from central regions and Western Pomerania, which may have some historical basis. In contrast, R1b was more widely distributed, reaching farther east and west than the others; however, it is observed at relatively low frequencies in regions adjacent to the western and eastern borders of Poland. Such a pattern of distribution of hg R in the Polish population may reflect historical events such as massive human migrations or changes in territorial borders.

A similar situation was observed in the case of hg I, whose distribution also followed geographic lines and possibly historical events. Haplogroup I is represented mostly in western Poland and in some regions of eastern Poland, mostly in the Podlaskie and Lublin voivodeships, but it also reaches the eastern parts of Mazovia, the western parts of Warmian–Mazurian, and almost all of Subcarpathia, which makes these regions similar to the west in terms of haplogroup frequency.

Interestingly, in the case of hg N, the Podlaskie voivodeship is distinct from the remaining voivodeships: as mentioned above, the frequency of hg N, which is common among the populations of Lithuania (Kasperaviciute et al., 2004) and Latvia (Pliss et al., 2015) and other inhabitants of northeast Europe, is 14.55% in this area, which brings Podlaskie closer to the northern regions in this regard. In contrast, hg E displays much greater homogeneity across the map, with fewer marked differences between regions.
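The principle behind the interpolation maps discussed above, estimating a haplogroup frequency at an unsampled location from frequencies observed at nearby sampled locations, can be sketched with inverse distance weighting (IDW). IDW is used here as an assumption for illustration; the interpolation method applied to the published maps is not stated in this section. The function name, the planar coordinates, and all frequencies other than the 14.55% reported for hg N in Podlaskie are hypothetical.

```python
import math


def idw_frequency(samples, x, y, power=2.0):
    """Inverse-distance-weighted estimate of a frequency at point (x, y).

    samples: list of (x, y, frequency) for sampled locations, with
    coordinates in a planar system (assumed, e.g., kilometers).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, freq in samples:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return freq  # exactly on a sampled location
        w = 1.0 / d ** power
        weighted_sum += w * freq
        weight_total += w
    return weighted_sum / weight_total


# Hypothetical sample locations; only 14.55 is a value from the text.
hg_n_samples = [
    (600.0, 500.0, 14.55),  # assumed position for Podlaskie
    (300.0, 300.0, 3.0),    # hypothetical central location
    (100.0, 450.0, 2.0),    # hypothetical western location
]
print(idw_frequency(hg_n_samples, 450.0, 400.0))
```

Evaluating the function on a regular grid of points yields a raster that can be rendered as a map; nearby samples dominate each estimate, so a single high-frequency region such as Podlaskie appears as a local peak.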

The comprehensive analysis of Y-chromosome variability described in the present study, i.e., based on the data from 2,705 individuals, including those from all voivodeships and most counties, and employing clustering as an additional method of population grouping, is the first of its type to be performed on the population of Poland. The findings confirm that the Polish population is characterized by a high degree of homogeneity, with only slight genetic differences being observed at the regional level. The use of regional clustering as an alternative to counties and voivodeships provided a more detailed view of the genetic structure of the population; the cluster analysis also identified any misleading differences observed between voivodeships.

Such a broad genetic analysis of the Polish population should be able to give insights into the history of different regions of the country, especially given that the individuals studied were asked to include information concerning their ancestry. The quality of the answers given was, however, less than satisfactory, and so no conclusions can be drawn, because the history of the paternal line of those people remains unknown. It seems that the only way to pursue the search for local history is to study populations with genealogical knowledge reaching back as far as three generations, as shown by Rebala et al. (2013).

The results of the present study, together with previously published data about mtDNA variability, could serve as the basis for further research into the connection between modern and ancient Poland with regard to human migration and resettlement, as well as historical and cultural influences. Furthermore, regional differences identified by the mtDNA variability study and the present one highlight the need for additional division of the population by cultural and ethnic criteria in such studies rather than just by geographical or administrative regionalization. Representatives of ethnic (Karaites, Tatars), cultural (Kashubians, Kurpie, Podhale highlanders), and indigenous groups in specific regions of Poland should be included in future analyses.

Data Availability Statement

The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: https://ega-archive.org/studies/EGAS00001004111.

Ethics Statement

The studies involving human participants were reviewed and approved by University of Lodz Ethics Review Board. The patients/participants provided their written informed consent to participate in this study.

Author Contributions

DS conceptualized and supervised the study, provided the funding, and organized and integrated the data. BM provided the funding and organized and integrated the data. ŁG, JJ, and MU performed bioinformatic analyses. MS-K and MSł performed microarray analysis. DS, JJ, WL, ŁG, PB, MSł, MS-K, MU, MSz, and AO analyzed the differences in haplogroup frequencies within the Polish population. JJ, MSł, ŁG, WL, PB, AO, MSz, and DS drafted the manuscript. All authors contributed to the article and approved the submitted version.

Funding

The study was financed by Polish Ministry of Science and Higher Education no. DIR/WK/2017/01: “Biobank network in Poland, within the BBMRI-ERIC Research Infrastructure of Biobanks and Biomolecular Resources” and POPC.02.03.01-00-0012/17: “Digital sharing of biomolecular and descriptive resources of Biobank and Department of Anthropology, University of Lodz – characteristics of populations living in present-day Poland through the ages. Information platform e-Czlowiek.pl” (Operational Programme Digital Poland for 2014–2020). POPULOUS collection was financed by the Polish POIG Grant 01.01.02-10-005/08 TESTOPLEK from the European Regional Development Fund.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Supplementary Material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2020.567309/full#supplementary-material

References

Altena, E., Smeding, R., van der Gaag, K. J., Larmuseau, M. H. D., Decorte, R., Lao, O., et al. (2020). The Dutch Y-chromosomal landscape. Eur. J. Hum. Genet. 28, 287–299. doi: 10.1038/s41431-019-0496-0

Behar, D. M., Metspalu, M., Baran, Y., Kopelman, N. M., Yunusbayev, B., Gladstein, A., et al. (2013). No evidence from genome-wide data of a Khazar origin for the Ashkenazi Jews. Hum. Biol. 85, 859–900. doi: 10.3378/027.085.0604

Branicki, W., Kalista, K., Kupiec, T., Wolanska-Nowak, P., Zoledziewska, M., and Lessig, R. (2005). Distribution of mtDNA haplogroups in a population sample from Poland. J. Forensic Sci. 50, 732–733.

Czerniakiewicz, J. (1987). Repatriacja Ludności Polskiej z ZSRR 1944-1948. Warsaw: PWN.

Diepenbroek, M., Cytacka, S., Szargut, M., Arciszewska, J., Zielinska, G., and Ossowski, A. (2019). Analysis of male specific region of the human Y chromosome sheds light on historical events in Nazi occupied eastern Poland. Int. J. Legal Med. 133, 395–409. doi: 10.1007/s00414-018-1943-0

Dobrowolska, S., Michalska-Madej, J., Słomka, M., Sobalska-Kwapis, M., and Strapagiel, D. (2019). Biobank Łódź® - population based biobank at the University of Łódź, Poland. Eur. J. Transl. Clin. Med. 2, 85–95. doi: 10.31373/ejtcm/109495

Eberhardt, P. (2000). Population Movements on the Territory of Poland Caused by the World War II. Warsaw: IGiPZ PAN.

Excoffier, L., and Lischer, H. E. (2010). Arlequin suite ver 3.5: a new series of programs to perform population genetics analyses under Linux and Windows. Mol. Ecol. Resour. 10, 564–567. doi: 10.1111/j.1755-0998.2010.02847.x

Gill, P., Brenner, C., Brinkmann, B., Budowle, B., Carracedo, A., Jobling, M. A., et al. (2001). DNA commission of the international society of forensic genetics: recommendations on forensic analysis using Y-chromosome STRs. Forensic Sci. Int. 124, 5–10. doi: 10.1016/s0379-0738(01)00498-4

Grzybowski, T., Malyarchuk, B. A., Derenko, M. V., Perkova, M. A., Bednarek, J., and Wozniak, M. (2007). Complex interactions of the Eastern and Western slavic populations with other European groups as revealed by mitochondrial DNA analysis. Forensic Sci. Int. Genet. 1, 141–147. doi: 10.1016/j.fsigen.2007.01.010

Hryciuk, G., Ruchniewicz, M., Szaynok, B., and Żbikowski, A. (2008). Wysiedlenia, Wypędzenia i Ucieczki 1939-1959: Atlas Ziem Polski. Warsaw: Demart SA.

Janica, J., Pepinski, W., Niemcunowicz-Janica, A., Skawronska, M., Aleksandrowicz-Bukin, M., Ptaszynska-Sarosiek, I., et al. (2005). Y-chromosome STR haplotypes and alleles in the ethnic group of Polish Tatars residing in the Northeastern Poland. Forensic Sci. Int. 150, 91–95. doi: 10.1016/j.forsciint.2004.08.012

Janica, J., Pepinski, W., Niemcunowicz-Janica, A., Skawronska, M., Soltyszewski, I., and Berent, J. (2008). Ethnic variation and forensic usefulness of Y-STR loci in inhabitants of northeastern Poland. Arch. Med. Sadowej Kryminol. 58, 17–21.

Janica, J., Pepinski, W., Skawronska, M., Niemcunowicz-Janica, A., Koc-Zurawska, E., and Soltyszewski, I. (2006). Polymorphism of four X-chromosomal STRs in a population sample of Belarusian minority residing in Podlasie (NE poland). Arch. Med. Sadowej Kryminol. 56, 232–235.

Jarczak, J., Grochowalski, L., Marciniak, B., Lach, J., Slomka, M., Sobalska-Kwapis, M., et al. (2019). Mitochondrial DNA variability of the Polish population. Eur. J. Hum. Genet. 27, 1304–1314. doi: 10.1038/s41431-019-0381-x

Juras, A., Dabert, M., Kushniarevich, A., Malmstrom, H., Raghavan, M., Kosicki, J. Z., et al. (2014). Ancient DNA reveals matrilineal continuity in present-day Poland over the last two millennia. PLoS One 9:e110839. doi: 10.1371/journal.pone.0110839

Kasperaviciute, D., Kucinskas, V., and Stoneking, M. (2004). Y chromosome and mitochondrial DNA variation in Lithuanians. Ann. Hum. Genet. 68(Pt 5), 438–452. doi: 10.1046/j.1529-8817.2003.00119.x

Kayser, M., Lao, O., Anslinger, K., Augustin, C., Bargel, G., Edelmann, J., et al. (2005). Significant genetic differentiation between Poland and Germany follows present-day political borders, as revealed by Y-chromosome analysis. Hum. Genet. 117, 428–443. doi: 10.1007/s00439-005-1333-9

Kersten, K. (1974). Repatriacja Ludności Polskiej po II Wojnie Światowej (Studium Historyczne). Wrocław: Zakład Narodowy im. Ossolińskich.

Kosiński, L. (1960). Pochodzenie Terytorialne Ludności Ziem Zachodnich w 1950. Warsaw: IGiZP.

Kostrzewa, G., Broda, G., Konarzewska, M., Krajewki, P., and Ploski, R. (2013). Genetic polymorphism of human Y chromosome and risk factors for cardiovascular diseases: a study in WOBASZ cohort. PLoS One 8:e68155. doi: 10.1371/journal.pone.0068155

Lao, O., Lu, T. T., Nothnagel, M., Junge, O., Freitag-Wolf, S., Caliebe, A., et al. (2008). Correlation between genetic and geographic structure in Europe. Curr. Biol. 18, 1241–1248. doi: 10.1016/j.cub.2008.07.049

Latuch, M. (1994). Repatriacja Ludności Polskiej w Latach 1955-1960 na tle Zewnętrznych Ruchów Wędrówkowych. Warsaw: PTD.

Lessig, R., Edelmann, J., and Krawczak, M. (2001). Population genetics of Y-chromosomal microsatellites in Baltic males. Forensic Sci. Int. 118, 153–157. doi: 10.1016/s0379-0738(01)00384-x

Lessig, R., Edelmann, J., Thiele, K., Kozhemyako, V., Jonkisz, A., and Dobosz, T. (2008). Results of Y-SNP typing in three different populations. Forensic Sci. Intern. Genet. Suppl. Ser. 1, 219–221. doi: 10.1016/j.fsigss.2007.10.122

Maliarczuk, B. A., and Derenko, M. (2008). Gene pool structure of Russian populations from the European part of Russia inferred from the data on Y chromosome haplogroups distribution. Genetika 44, 226–231.

Malyarchuk, B., Grzybowski, T., Derenko, M., Perkova, M., Vanecek, T., Lazur, J., et al. (2008). Mitochondrial DNA phylogeny in Eastern and Western Slavs. Mol. Biol. Evol. 25, 1651–1658. doi: 10.1093/molbev/msn114

Malyarchuk, B. A., Rogozin, I. B., Berikov, V. B., and Derenko, M. V. (2002). Analysis of phylogenetically reconstructed mutational spectra in human mitochondrial DNA control region. Hum. Genet. 111, 46–53. doi: 10.1007/s00439-002-0740-4

Mielnik-Sikorska, M., Daca, P., Malyarchuk, B., Derenko, M., Skonieczna, K., Perkova, M., et al. (2013a). The history of Slavs inferred from complete mitochondrial genome sequences. PLoS One 8:e54360. doi: 10.1371/journal.pone.0054360

Mielnik-Sikorska, M., Daca, P., Wozniak, M., Malyarchuk, B. A., Bednarek, J., Dobosz, T., et al. (2013b). Genetic data from Y chromosome STR and SNP loci in Ukrainian population. Forensic Sci. Int. Genet. 7, 200–203. doi: 10.1016/j.fsigen.2012.05.007

Ociepka, B. (2001). Deportacje, Wysiedlenia, Przesiedlenia - Powojenne Migracje z Polski i do Polski. Poznań: Instytut Zachodni.

Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., et al. (2011). Scikit-learn: machine Learning in Python. J. Mach. Learn. Res. 12, 2825–2830.

Pepinski, W., Janica, J., Skawronska, M., Niemcunowicz-Janica, A., and Soltyszewski, I. (2001). Population genetics of 15 STR loci in the population of Podlasie (NE Poland). Forensic Sci. Int. 124, 226–227. doi: 10.1016/s0379-0738(01)00603-x

Pepinski, W., Niemcunowicz-Janica, A., Ptaszynska-Sarosiek, I., Skawronska, M., Koc-Zorawska, E., Janica, J., et al. (2004a). Population genetics of Y-chromosome STRs in a population of Podlasie, Northeastern Poland. Forensic Sci. Int. 144, 77–82. doi: 10.1016/j.forsciint.2004.02.024

Pepinski, W., Niemcunowicz-Janica, A., Skawronska, M., Koc-Zorawska, E., Janica, J., and Soltyszewski, I. (2004b). Allele distribution of 15 STR loci in a population sample of Byelorussian minority residing in the northeastern Poland. Forensic Sci. Int. 139, 265–267. doi: 10.1016/j.forsciint.2003.11.013

Pepinski, W., Niemcunowicz-Janica, A., Skawronska, M., Koc-Zorawska, E., Janica, J., and Soltyszewski, I. (2004c). Allele distribution of 15 STR loci in a population sample of the Lithuanian minority residing in the Northeastern Poland. Forensic Sci. Int. 144, 65–67. doi: 10.1016/j.forsciint.2004.01.023

Pepinski, W., Niemcunowicz-Janica, A., Skawronska, M., Janica, J., Koc-Zorawska, E., Aleksandrowicz-Bukin, M., et al. (2005a). Genetic data on 15 STR loci in the ethnic group of Polish Tatars residing in the area of Podlasie (Northeastern Poland). Forensic Sci. Int. 149, 263–265. doi: 10.1016/j.forsciint.2004.07.009

Pepinski, W., Niemcunowicz-Janica, A., Skawronska, M., Janica, J., Koc-Zorawska, E., and Soltyszewski, I. (2005b). Genetic data on 15 STRs in a population sample of religious minority of Old believers residing in the northeastern Poland. Forensic Sci. Int. 148, 61–63. doi: 10.1016/j.forsciint.2004.04.010

Petrejcikova, E., Sotak, M., Bernasovska, J., Bernasovsky, I., Sovicova, A., Bozikova, A., et al. (2010). The genetic structure of the Slovak population revealed by Y-chromosome polymorphisms. Anthropol. Sci. 118:ase.090203. doi: 10.1537/ase.090203

Piesowicz, K. (1988). Wielkie ruchy migracyjne w latach 1945-1950. Część I. Stud. Demograficzne 4:96.

Pliss, L., Timsa, L., Rootsi, S., Tambets, K., Pelnena, I., Zole, E., et al. (2015). Y-chromosomal lineages of latvians in the context of the genetic variation of the eastern-baltic region. Ann. Hum. Genet. 79, 418–430. doi: 10.1111/ahg.12130

Ploski, R., Wozniak, M., Pawlowski, R., Monies, D. M., Branicki, W., Kupiec, T., et al. (2002). Homogeneity and distinctiveness of Polish paternal lineages revealed by Y chromosome microsatellite haplotype analysis. Hum. Genet. 110, 592–600. doi: 10.1007/s00439-002-0728-720

Polish Ministry of Information (1941). Concise Statistical Year-Book of Poland: September 1939 – June 1941. London: Statistics Poland.

Polish War Reparations Bureau (1947). Sprawozdanie w Przedmiocie Strat i Szkód Wojennych Polski w Latach 1939-1945. Warsaw: Polish War Reparations Bureau.

Poznik, G. D. (2016). Identifying Y-chromosome haplogroups in arbitrarily large samples of sequenced or genotyped men. bioRxiv [Preprint]. doi: 10.1101/088716

Rebala, K., Martinez-Cruz, B., Tonjes, A., Kovacs, P., Stumvoll, M., Lindner, I., et al. (2013). Contemporary paternal genetic landscape of Polish and German populations: from early medieval Slavic expansion to post-World War II resettlements. Eur. J. Hum. Genet. 21, 415–422. doi: 10.1038/ejhg.2012.190

Rebala, K., Mikulich, A. I., Tsybovsky, I. S., Sivakova, D., Dzupinkova, Z., Szczerkowska-Dobosz, A., et al. (2007). Y-STR variation among Slavs: evidence for the Slavic homeland in the middle Dnieper basin. J. Hum. Genet. 52, 406–414. doi: 10.1007/s10038-007-0125-6

Rebala, K., and Szczerkowska, Z. (2004). Identification of a very short YCAII allele in the northern Polish population. Arch. Med. Sadowej Kryminol. 54, 17–24.

Rebala, K., and Szczerkowska, Z. (2005). Polish population study on Y chromosome haplotypes defined by 18 STR loci. Int. J. Legal Med. 119, 303–305. doi: 10.1007/s00414-005-0547-7

Rosser, Z. H., Zerjal, T., Hurles, M. E., Adojaan, M., Alavantic, D., Amorim, A., et al. (2000). Y-chromosomal diversity in Europe is clinal and influenced primarily by geography, rather than by language. Am. J. Hum. Genet. 67, 1526–1543. doi: 10.1086/316890

Soltyszewski, I., Pepinski, W., Spolnicka, M., Kartasinska, E., Konarzewska, M., and Janica, J. (2007). Y-chromosomal haplotypes for the AmpFlSTR Yfiler PCR Amplification Kit in a population sample from Central Poland. Forensic Sci. Int. 168, 61–67. doi: 10.1016/j.forsciint.2006.01.009

Soltyszewski, I., Plocienniczak, A., Fabricius, H. A., Kornienko, I., Vodolazhsky, D., Parson, W., et al. (2008). Analysis of forensically used autosomal short tandem repeat markers in Polish and neighboring populations. Forensic Sci. Int. Genet. 2, 205–211. doi: 10.1016/j.fsigen.2008.02.003

Spolnicka, M., Dabrowska, J., Szablowska-Gnap, E., Paleczka, A., Jablonska, M., Zbiec-Piekarska, R., et al. (2017). Intra- and inter-population analysis of haplotype diversity in Yfiler® Plus system using a wide set of representative data from Polish population. Forensic Sci. Int. Genet. 28, e22–e25. doi: 10.1016/j.fsigen.2017.01.014

Strapagiel, D., Sobalska-Kwapis, M., Słomka, M., and Marciniak, B. (2016). Biobank Lodz - DNA Based Biobank at the University of Lodz, Poland. Open J. Bioresour. 3:e6.

Tam, V., Patel, N., Turcotte, M., Bosse, Y., Pare, G., and Meyre, D. (2019). Benefits and limitations of genome-wide association studies. Nat. Rev. Genet. 20, 467–484. doi: 10.1038/s41576-019-0127-1

Trzeciecki, M. (2016). The Past Societies. Vol. 5. 500AD - 1000AD. Warsaw: Institute of Archaeology and Ethnology.

Wang, J., Samuels, D. C., Shyr, Y., and Guo, Y. (2017). StrandScript: evaluation of Illumina genotyping array design and strand correction. Bioinformatics 33, 2399–2401. doi: 10.1093/bioinformatics/btx186

Wolanska-Nowak, P., Branicki, W., Parys-Proszek, A., and Kupiec, T. (2009). A population data for 17 Y-chromosome STR loci in South Poland population sample–some DYS458.2 variants uncovered and sequenced. Forensic Sci. Int. Genet. 4, e43–e44. doi: 10.1016/j.fsigen.2009.04.009

Wozniak, M., Grzybowski, T., Starzynski, J., and Marciniak, T. (2007). Continuity of Y chromosome haplotypes in the population of Southern Poland before and after the Second World War. Forensic Sci. Int. Genet. 1, 134–140. doi: 10.1016/j.fsigen.2007.01.003

Wozniak, M., Malyarchuk, B., Derenko, M., Vanecek, T., Lazur, J., Gomolcak, P., et al. (2010). Similarities and distinctions in Y chromosome gene pool of Western Slavs. Am. J. Phys. Anthropol. 142, 540–548. doi: 10.1002/ajpa.21253

Zastera, J., Roewer, L., Willuweit, S., Sekerka, P., Benesova, L., and Minarik, M. (2010). Assembly of a large Y-STR haplotype database for the Czech population and investigation of its substructure. Forensic Sci. Int. Genet. 4, e75–e78. doi: 10.1016/j.fsigen.2009.06.005

Zupan, A., Vrabec, K., and Glavac, D. (2013). The paternal perspective of the Slovenian population and its relationship with other populations. Ann. Hum. Biol. 40, 515–526. doi: 10.3109/03014460.2013.813584

Appendix A: Supplemental Data

Web Resources

BBMRI-ERIC Directory, https://directory.bbmri-eric.eu/

Python, https://www.python.org/

Scikit-learn, https://scikit-learn.org/sta

QGIS, http://qgis.org

Geodesic and Cartographic Documentation Center, https://gis-support.com/spatial-datasets-for-poland/

Google Maps Api, https://developers.google.com/maps

GenomeStudio, https://www.illumina.com/techniques/microarrays/array-data-analysis-experimental-design/genomestudio.html

StrandScript, https://github.com/seasky002002/Strandscript

European Genotype Archive, https://www.ebi.ac.uk/ega/

yHaplo, https://github.com/23andMe/yhaplo

International Society of Genetic Genealogy. Y-DNA Haplogroup Tree 2016, http://www.isogg.org/tree/

Arlequin, http://cmpg.unibe.ch/software/arlequin35/

R, https://www.r-project.org/.

Keywords: Y-chromosome, haplogroups, Polish population, regions of Poland, microarray analysis, SNPs

Citation: Grochowalski Ł, Jarczak J, Urbanowicz M, Słomka M, Szargut M, Borówka P, Sobalska-Kwapis M, Marciniak B, Ossowski A, Lorkiewicz W and Strapagiel D (2020) Y-Chromosome Genetic Analysis of Modern Polish Population. Front. Genet. 11:567309. doi: 10.3389/fgene.2020.567309

Received: 29 May 2020; Accepted: 27 August 2020;
Published: 23 October 2020.

Edited by:

Fulvio Cruciani, Sapienza University of Rome, Italy

Reviewed by:

Hui Li, Fudan University, China
Damir Marjanovic, Institute for Anthropological Research, Croatia

Copyright © 2020 Grochowalski, Jarczak, Urbanowicz, Słomka, Szargut, Borówka, Sobalska-Kwapis, Marciniak, Ossowski, Lorkiewicz and Strapagiel. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Dominik Strapagiel, dominik.strapagiel@biol.uni.lodz.pl

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.