Edited by: Fulvio Cruciani, Sapienza University of Rome, Italy
Reviewed by: Horolma Pamjav, Bűnügyi Szakértői és Kutatóintézet, Hungary; Hovirag Lancioni, University of Perugia, Italy; Antonio González-Martín, Complutense University of Madrid, Spain
This article was submitted to Evolutionary and Population Genetics, a section of the journal Frontiers in Genetics
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
The Slovenian territory played a crucial role in the past, serving as a gateway for several human migrations. Previous studies used Slovenians as a source population to interpret different demographic events that happened in Europe, but not much is known about the genetic background and the demographic history of this population. Here, we analyzed genome-wide data from 96 individuals to shed light on the genetic role and history of the Slovenian population. Y chromosome diversity splits into two major haplogroups, R1b and R1a, with the latter suggesting a genetic contribution from the steppe. Slovenian individuals are more closely related to Northern and Eastern European populations than to Southern European populations, even though the latter are geographically closer. This pattern is confirmed by an admixture and clustering analysis. We also identified a single stream of admixture events in Slovenians, with Sardinians and Russians as source populations, dated to ∼2630 BCE (2149-3112). Using ancient samples, we found significant admixture in Slovenians with Yamnaya and early Neolithic Hungarians as sources, dated to ∼1762 BCE (1099-2426), suggesting a strong contribution from the steppe to the foundation of the observed modern genetic diversity. Finally, we looked for signals of selection in candidate variants and we found significant hits in
The Slovenian territory is geographically located between the Alps, the Adriatic Sea, and the Pannonian basin, and as such it could have been used as a gateway for different populations over several periods of time. However, the presence of geographical and possibly cultural barriers could have led to a more puzzling role for this region. The territory of modern-day Slovenia was settled during the 6th and 7th centuries AD by different Slavic tribes from at least two different directions: one from the north and one from the east (
Previous genetic studies were only based on Y chromosome variation (
All these elements highlight the need for a broader and more comprehensive genetic study of the Slovenian population. The outcomes of such a study could be useful to several disciplines, not only population genetics and linguistics. For example, in the context of genetic epidemiology, describing the genetic landscape of this population would help to understand the genetic background of disease-related loci and their distribution; in fact, Slovenians show evidence of familial hypercholesterolemia (3.1%) (
Therefore, our aims are to provide (i) a detailed characterization of the genetic structure of Slovenians in a broader European context using both Y chromosome and autosomal data, (ii) a description of the past and present admixture pattern, and (iii) a survey of variants putatively under selection and associated with different traits/diseases. For these purposes, we used genotype data from 96 Slovenian individuals and we analyzed them together with previously published modern and ancient samples.
Overall, 96 samples ranging from the Slovenian littoral to Lower Styria were genotyped for 713,599 markers using the OmniExpress 24-V1 BeadChips (Figure
Geographical location of the samples included in this study. Sample size is reported next to each sampling location together with the region ID. Map has been modified from
First, 52 male samples were extracted from the dataset and Y chromosome haplogroups were assigned using AMY-tree v2.0 software (
Principal Component Analysis (PCA) was performed using the option --pca in PLINK v1.9 (
We inferred the ancestral structure using ADMIXTURE v1.30 (
Analyses looking for signals of selection were performed using PCAdapt implemented in the R package PCAdapt (
First, Y chromosome genetic diversity was assessed. A total of 52 Y chromosomes were analyzed for 195 SNPs. The majority of individuals (25, 48.1%) belong to the haplogroup R1a1a1a (R-M417) while the second major haplogroup is represented by R1b (R-M343) including 15 individuals (28.8%). Twelve samples are assigned to haplogroup I (I-M170): five and two samples belong to haplogroup I2a (I-L460) and I1 (I-M253), respectively, while the remaining five samples did not have enough information to be further assigned. We then performed principal component analysis on autosomal data to further investigate the presence of structure within the Slovenian population. We did not find any clusters even when samples were highlighted by the region of origin (Supplementary Figure
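The haplogroup proportions quoted above follow directly from the reported counts. As a minimal sketch, the arithmetic can be reproduced as below; the dictionary and function names are illustrative and not part of the original analysis, and only the counts (25, 15, and 12 out of 52) come from the text.

```python
# Counts of Y chromosome haplogroups as reported in the text (52 males in total).
# The variable and function names are hypothetical, chosen for illustration.
haplogroup_counts = {
    "R1a": 25,  # R-M417
    "R1b": 15,  # R-M343
    "I": 12,    # five I2a, two I1, five not assigned further
}


def haplogroup_frequencies(counts):
    """Return each haplogroup's share of the sample as a percentage (1 decimal)."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


frequencies = haplogroup_frequencies(haplogroup_counts)
for name, pct in frequencies.items():
    print(f"{name}: {pct}%")
```

The three counts sum to the 52 analyzed males, so the percentages account for the whole sample.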
A principal component analysis was then performed on the Slovenian_HO dataset including 127,385 SNPs. The first two PCs explained ∼63% of the variance and PC1 divides the African samples while PC2 highlights the Asian cluster (Supplementary Figure
Population structure of Slovenian samples.
The presence of genetic clusters within the Slovenia_HO_EU dataset has been further investigated using “Mclust” which suggests
The relationships between populations were also assessed by computing a pairwise Fst matrix. Analysis of the UPGMA tree based on the Fst matrix shows all Slovenian individuals clustering together with Hungarians, Czechs, Croatians, Ukrainians, and Belarusians (Supplementary Figure
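To make the pairwise Fst computation concrete, the sketch below implements Hudson's estimator as a ratio of averages across SNPs. This is an assumption made for illustration: the text does not state which Fst estimator was used, and the allele frequencies and sample sizes in the example are invented.

```python
def hudson_fst(p1, p2, n1, n2):
    """Hudson's Fst between two populations, as a ratio of averages over SNPs.

    p1, p2: per-SNP allele frequencies in population 1 and 2 (same SNP order).
    n1, n2: number of sampled chromosomes in each population.
    NOTE: the choice of estimator is an assumption, not the article's stated method.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b in zip(p1, p2):
        # Squared frequency difference, corrected for finite sample size.
        numerator += (a - b) ** 2 - a * (1 - a) / (n1 - 1) - b * (1 - b) / (n2 - 1)
        # Probability that two alleles drawn from different populations differ.
        denominator += a * (1 - b) + b * (1 - a)
    return numerator / denominator if denominator > 0 else float("nan")


# Hypothetical allele frequencies at four SNPs in two populations.
pop_a = [0.10, 0.50, 0.80, 0.30]
pop_b = [0.15, 0.45, 0.70, 0.35]
fst_example = hudson_fst(pop_a, pop_b, n1=192, n2=40)
print(f"Fst = {fst_example:.4f}")
```

Repeating the calculation for every pair of populations fills the distance matrix from which a tree such as UPGMA can be built.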
The pattern of runs of homozygosity computed on the Slovenian population does not differ significantly from that of Hungarians, Czechs, and Croatians (Mann Whitney
All Slovenian individuals share a common pattern of genetic ancestry, as revealed by ADMIXTURE analysis. The three major ancestry components are the North East and North West European ones (light blue and dark blue, respectively, Figure
Unsupervised admixture analysis of Slovenians. Results for
Using ALDER, the most significant admixture event was obtained with Russians and Sardinians as source populations and it happened 135 ± 9.31 generations ago (
Admixture events identified with ALDER and MALDER. The gray dots represent significant admixture events detected with ALDER using Slovenians as target, the solid line represents the single admixture event detected using MALDER, dashed lines represent the confidence interval. Only the significant results after multiple testing correction are plotted. For ALDER results see Supplementary Table
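LD-decay methods report admixture times in generations, which must be multiplied by a generation time to obtain years. The sketch below converts the ALDER estimate quoted in the text (135 ± 9.31 generations). The value of 29 years per generation is an assumption made here for illustration, and the function name is hypothetical. The calendar dates given in the article may rest on a different generation time and on the MALDER estimate, so the output is not expected to match them.

```python
def generations_to_years_ago(generations, std_err, years_per_generation=29.0):
    """Convert an admixture time in generations to years before sampling.

    Returns (point, lower, upper), where the bounds are +/- 1 standard error.
    years_per_generation is an assumed value, not one taken from the article.
    """
    point = generations * years_per_generation
    lower = (generations - std_err) * years_per_generation
    upper = (generations + std_err) * years_per_generation
    return point, lower, upper


# ALDER estimate quoted in the text: 135 +/- 9.31 generations.
point, lower, upper = generations_to_years_ago(135, 9.31)
print(f"{point:.0f} years ago (range {lower:.0f} to {upper:.0f})")
```

Because the conversion is linear, the assumed generation time scales the point estimate and the interval width by the same factor.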
We then modeled the Slovenian population as target of admixture of ancient individuals from
Considering the pattern of admixture in Slovenian population, we searched for highly differentiated loci between Slovenians and specific reference populations. By comparing Russians and Slovenians we found that rs4129267 on
GWAS SNPs with FDR < 0.1 in the selection scan.
Chr | Position (bp) | Gene | SNP | Comparison |
---|---|---|---|---|
1 | 154453788 | IL6R | rs4129267 | SLO vs. RUS |
7 | 99642556 | ZSCAN25 | rs10242455 | SLO vs. GRE |
10 | 100315722 | PKD2L1 | rs603424 | SLO vs. RUS |
11 | 61803311 | FADS1 | rs174547 | SLO vs. GRE |
11 | 61804006 | FADS1 | rs174550 | SLO vs. GRE |
11 | 61830500 | FADS2 | rs1535 | SLO vs. GRE |
14 | 92460608 | SLC24A4 | rs10498633 | SLO vs. ITN |
15 | 28120472 | HERC2 | rs12913832 | SLO vs. GRE; SLO vs. SAR; SLO vs. SPA |
16 | 54455881 | LOC105371272/73 | rs2388639 | SLO vs. ITN |
19 | 32873722 | SLC7A9 – CEP89 | rs8101881 | SLO vs. ITN |
21 | 41211811 | BACE2 | rs6517656 | SLO vs. SPA; SLO vs. ITN |
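The table applies a false discovery rate (FDR) threshold of 0.1. One common way to control the FDR is the Benjamini-Hochberg procedure, sketched below. This is an assumption for illustration only: the text does not specify how the FDR was computed, and the p-values in the example are invented.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    NOTE: shown as one standard FDR procedure; the article's exact method
    is not stated in the text.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-SNP p-values from a selection scan.
example_p = [0.01, 0.04, 0.03, 0.20]
example_q = benjamini_hochberg(example_p)
significant = [i for i, q in enumerate(example_q) if q < 0.1]
print(example_q, significant)
```

SNPs whose adjusted value falls below the chosen threshold (0.1 in the table) would be retained as hits.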
Slovenia presents a peculiar landscape, composed of mountainous regions in the North West that give way to flat lands toward the East. The border with the West represents not only a geographical barrier but also a linguistic one. To date, no previous studies have described the genetic variation in the Slovenian population. We considered different measurements of population admixture, isolation, and selection. In this study, we addressed the genetic features of Slovenian genomes and how admixture and selection shaped the genetic diversity of this population. Analysis of Y chromosome variation showed the presence of two main haplogroups, R1b and R1a, confirming previous findings and suggesting gene flow from the Steppe (
Our analyses of population structure based on autosomal data revealed the absence of strong substructure within the Slovenian samples, although the samples came from different regions of the country. We discovered a strong affinity between Slovenians and Central-Eastern European populations such as Czechs and Hungarians. Slovenians are closer to North European samples than to South European ones, including the neighboring North Italian population. Our findings suggest that Slovenian ancestry is more closely related to populations from Northern-Central Europe than to those from South-Western Europe.
For the purpose of further studies focused on genetic epidemiology, our analyses show no evidence of isolation in Slovenians. Nevertheless, we found a specific pattern of ROH hotspots in our Slovenian samples, although differences in sample size limit the possibility of replicating this pattern in the other populations of the dataset. Some of these regions contain interesting genes linked to GWAS traits, including Type 1 Diabetes, that should be further investigated.
Our analyses of admixture events using methods based on LD decay revealed that the modern Slovenian gene pool could be explained by several admixture events that happened in a single window of time. We also showed that there is no support for multiple admixture events across time. Overall, the Slovenian gene pool seems to have formed during the Bronze Age through admixture between source populations for which North-Eastern European and Near-Eastern populations serve as proxies. This pattern was further confirmed when we used ancient genomes. Specifically, we obtained the strongest signal for Slovenians using Yamnaya and Hungary Early Neolithic samples as references. The estimated admixture time falls within the range of the one obtained using modern populations. We can conclude that populations closely related to Yamnaya and early Neolithic Hungarians contributed during the Bronze Age to the foundation of the modern genetic variability in Slovenians. We hypothesize that disease-related and trait-specific alleles were also likely introduced into the Slovenian gene pool during this period, such as pigmentation alleles (including the high frequency of blue-eye alleles found in Slovenian samples), lactose tolerance (rs4988235, whose frequency in Slovenia is 0.36), and immune-related alleles such as rs4833095 in TLR1 (Slovenian derived allele frequency of 0.7, Bronze Age ∼0.8).
Considering the discovered admixture pattern that contributes to the current genetic diversity of this population, we analyzed possible selection signals in a broader context. We specifically focused on putatively selected variants in this population that could be useful for genetic epidemiology and for better understanding the forces shaping genetic diversity in Slovenians. Our selection study revealed significant hits at markers associated mainly with lipid traits and eye pigmentation when compared to South Europeans (Greeks) such as
On the other hand, when we compared Slovenians with North-Eastern populations, such as Russians, Slovenians showed signatures of selection in
One limitation of our study is the use of SNP-chip data only; future studies involving whole-genome sequencing would reveal in more detail the genetic features of the genes under selection, while also enhancing the power to detect putatively deleterious rare variants.
MM and PM designed the project, performed the analyses, and wrote the manuscript. MR-G and DG collected the samples. PG, MR-G participated in project coordination and helped to draft the manuscript. All authors read and approved the final manuscript.
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
We are grateful to the genotoul bioinformatics platform Toulouse Midi-Pyrenees (Bioinfo Genotoul) for providing computing resources (www.bioinfo.genotoul.fr). We wish to acknowledge the DJEI/DES/SFI/HEA Irish Centre for High-End Computing (ICHEC) for the provision of computational facilities and support.
The Supplementary Material for this article can be found online at: