Genetic diversity and breed-informative SNPs identification in domestic pig populations using coding SNPs

Background: The use of breed-informative genetic markers, specifically coding Single Nucleotide Polymorphisms (SNPs), is crucial for breed traceability, authentication of meat and dairy products, and the preservation and improvement of pig breeds. By identifying breed-informative markers, we aimed to gain insights into the genetic mechanisms that influence production traits, enabling informed decisions in animal management and promoting sustainable pig production to meet the growing demand for animal products.
Methods: Our dataset consists of 300 coding SNPs genotyped in three Italian commercial pig populations: Landrace, Yorkshire, and Duroc. First, we analyzed the genetic diversity among the populations. Then, we applied a discriminant analysis of principal components to identify the most informative SNPs for discriminating between these populations. Lastly, we conducted a functional enrichment analysis to identify the most enriched pathways related to the genetic variation observed in the pig populations.
Results: The alpha diversity indexes revealed a high genetic diversity within the three breeds. Observed heterozygosity exceeded expected heterozygosity, revealing an excess of heterozygotes in the populations that was supported by negative values of the fixation index (FIS) and by deviations from Hardy-Weinberg equilibrium. The Euclidean distance, the pairwise FST, and the pairwise Nei's GST genetic distances revealed that the Yorkshire and Landrace breeds are genetically the closest, with distance values of 2.242, 0.029, and 0.033, respectively. Conversely, the Landrace and Duroc breeds showed the highest genetic divergence, with distance values of 2.815, 0.048, and 0.052, respectively. We identified 28 significant SNPs that are related to phenotypic traits, and these SNPs were able to differentiate between the pig breeds with high accuracy. The functional enrichment analysis of the informative SNPs highlighted biological functions related to DNA packaging, chromatin integrity, and the organization of DNA into higher-order structures.
Conclusion: Our study sheds light on the genetic underpinnings of phenotypic variation among three Italian pig breeds, offering potential insights into the mechanisms driving breed differentiation. By prioritizing breed-specific coding SNPs, our approach enables a more focused analysis of the genomic regions relevant to the research question than an analysis of the entire genome.


Introduction
The domestic pig is an important livestock animal that is widely used for red meat, lard, and cured goods. It is a key player in the meat industry, particularly in Europe (OECD, 2022). Previous studies have suggested that the European domestic pig (Sus scrofa domesticus) is primarily descended from European wild boars (Giuffra et al., 2000). However, this notion has been challenged by the identification of Asian mitochondrial DNA (mtDNA) haplotypes in European Yorkshire, Duroc, and Landrace pigs. This finding suggests that there may have been some interbreeding or genetic exchange between the two populations in the past (Giuffra et al., 2000; Larson et al., 2005). Throughout history, Italy has developed various breeds of pigs, each with unique characteristics and uses, such as Cinta Senese (Tuscany region), Nero Siciliano (Sicily region), and Mora Romagnola (Emilia-Romagna region) (Franci and Pugliese, 2007). The Yorkshire breed is one of the most commonly used commercial pig breeds and was introduced to Italy in the early 20th century because of its fast growth rate and high efficiency in converting feed into meat. The Landrace breed was introduced to Italy in the mid-20th century and has since been used in industrial pork production. The Duroc breed originated in the United States in the 19th century and has been exported to many countries, including Italy. This breed is often used in crossbreeding programs to produce hybrid pigs with desirable traits such as meat quality and growth rate (https://www.thepigsite.com/).
Both genetic and environmental factors have an impact on the phenotypic characteristics of commercial pig breeds, such as meat quality and disease resistance (Rosenvold and Andersen, 2003). Therefore, understanding the genetic diversity of these breeds is crucial for enhancing animal production, conserving animal genetic resources, and evaluating breed performance (Bovo et al., 2020; Dadousis et al., 2022). Such research can help identify breeds with better phenotypic traits and the ability to adapt to difficult conditions (Bovo et al., 2020). It can also support the sustainable growth of animal production in different settings and make it easier to reach evolutionary breeding goals rapidly (Notter, 1999).
The use of genome-wide panels of single nucleotide polymorphisms (SNPs) has transformed the study of pig breeds by allowing the examination of complex relationships among them (Muñoz et al., 2019). However, processing such vast amounts of data can be challenging, which creates the need for a more efficient approach. One potential solution is to create lower-density panels built from a reduced set of SNPs specific to each breed. Such panels would require less time and effort to analyze and would therefore be more feasible to use. Breed-specific SNPs are frequently used in conservation biology to manage and protect livestock resources (Ozerov et al., 2013; Huisman, 2017), as well as for breed identification and authentication of meat and dairy products (Russo et al., 2007; Fontanesi et al., 2010).
The use of breed-informative SNPs has shown promising results in improving desired traits in pig breeding programs. A recent study on Italian Yorkshire pigs found that selecting SNPs associated with production traits, such as lean meat content, daily gain, and feed/gain ratio, can increase the frequency of desirable alleles over time, leading to faster improvement of these traits (Fontanesi et al., 2015). Genome-wide association studies (GWAS) have also become a popular way to find genetic variants linked to important production traits like meat and carcass quality, growth, and teat number in European pig breeds (Tang et al., 2019; Fabbri et al., 2020; Bovo et al., 2021). Various analytical tools have been developed to identify breed-informative SNPs, such as Random Forests, Principal Component Analysis, regression, allele frequency differences, and Discriminant Analysis of Principal Components (Wilkinson et al., 2011; Schiavo et al., 2020; Hayah et al., 2021; Dadousis et al., 2022). These tools can help researchers identify key genetic markers and gain a deeper understanding of the genetic basis of production traits in pig breeds.
The aim of this study is to identify a breed-informative SNP panel with high discriminatory power to facilitate breed traceability and preservation efforts while also supporting breeding programs that prioritize desirable traits in these pig breeds. We anticipate that the identified SNPs will provide a useful tool for researchers and breeders alike, enabling them to make more informed decisions in animal management and breeding programs. By focusing on coding SNPs, we hope to identify genetic markers that are potentially functional, allowing for a better understanding of the genetic mechanisms governing desirable production traits in commercial pig breeds. Ultimately, our research may contribute to the long-term sustainability of pig production, helping to meet the growing demand for animal products while preserving animal genetic diversity.

Source of data and SNP
The data used in this research are part of the MISAGEN project's preexisting database (Botti et al., 2006; Biffani et al., 2011). This initiative gathered and archived a comprehensive dataset including pedigree information, clinical symptomatology, and health-related phenotypes from a commercial pig breeding population sampled in Northern Italy. The initial dataset contained records from 2908 weaning piglets representing four distinct breeds: Yorkshire, Landrace, Duroc, and Pietrain. DNA was extracted from nasal swabs and genotyped with the Illumina PorcineSNP60 BeadChip, which targets more than 60,000 Single Nucleotide Polymorphisms (SNPs) distributed across the pig genome.

Quality control and SNP extraction
The genotype data underwent rigorous quality control using the quality control module of the GenABEL package for the R statistical software (Aulchenko et al., 2007). After applying these filters, a total of 14,967 SNPs (24.8% of the available 60,123 SNPs) and 77 individuals (0.063% of the total) were excluded from the analysis. In this study, a set of 300 coding SNPs was chosen on the basis of their physical proximity to genes linked to pig immunity. Plink software (Purcell et al., 2007) was used to extract those 300 coding SNPs from the three distinct pig populations: Yorkshire (YO), Landrace (LA), and Duroc (DU). Each breed was represented by 100 animals, resulting in a total of 300 animals analyzed in the study.

Genetic diversity estimates
In this study, we used a range of genetic diversity metrics to analyze our dataset; all of the analyses were conducted in R (R Core Team, 2020). All of the population genetics estimates reported in this work, including allele frequencies, expected (HE) and observed (HO) heterozygosity, the inbreeding coefficient (FIS), alpha (α) diversity indexes, exact tests for Hardy-Weinberg Equilibrium (HWE), variants under selection, and fixed alleles, were computed using the "dartR" package (Gruber et al., 2022) and its dependencies. The genetic distances between breeds were computed with the same package. The graphics were created using the "ggplot2" and "Graphics" packages (Hadley, 2016; R Core Team, 2020).
HE, HO, and FIS were estimated according to Nei (Nei, 1987). Alpha diversity indexes for allelic richness (q = 0), Shannon information (q = 1), and heterozygosity (q = 2) were estimated according to Sherwin (Sherwin et al., 2017). The exact p-values for the HWE test were calculated using the method described by Wigginton (Wigginton et al., 2005), and the results were visualized using a ternary plot. We used the OutFlank method (Whitlock and Lotterhos, 2015) to find variants that were subject to selection pressures. This method infers the distribution of the neutral fixation index (FST) from the actual data and fits it to a chi-square model. Loci with a p-value of less than 0.05 were considered FST outliers and indicative of selection pressure. To estimate the pairwise FST values for genetic distances between pig breeds, we used the Weir and Cockerham update of Wright's approach (Wright, 1951; Weir and Cockerham, 1984), while we used Nei's approach (Nei, 1987) to estimate the pairwise GST values for genetic distances between populations.
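The per-locus quantities named above can be illustrated with a minimal Python sketch. It is not the dartR implementation used in the study: the function names are hypothetical, the expected heterozygosity is the simple uncorrected form 1 − Σp² (without Nei's small-sample correction), and the diversity values are expressed as effective numbers of alleles (Hill numbers), which is an assumption about the reporting scale.

```python
import math


def locus_stats(n_aa, n_ab, n_bb):
    """Return (H_O, H_E, F_IS) for one biallelic locus from genotype counts.

    H_E is the uncorrected expected heterozygosity 1 - sum(p_i^2);
    F_IS = 1 - H_O / H_E, so an excess of heterozygotes gives F_IS < 0.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p                       # frequency of allele B
    h_obs = n_ab / n
    h_exp = 1.0 - p * p - q * q
    f_is = 1.0 - h_obs / h_exp if h_exp > 0 else float("nan")
    return h_obs, h_exp, f_is


def hill_number(freqs, q):
    """Effective number of alleles of order q (q = 0, 1 or 2 in the text)."""
    freqs = [f for f in freqs if f > 0]
    if q == 1:
        # Limit q -> 1: exponential of the Shannon entropy.
        return math.exp(-sum(f * math.log(f) for f in freqs))
    return sum(f ** q for f in freqs) ** (1.0 / (1.0 - q))
```

For example, `locus_stats(20, 60, 20)` describes a locus with 60% heterozygotes where 50% are expected, which yields a negative FIS, and `hill_number([0.48, 0.52], 2)` gives the effective number of alleles for two nearly balanced alleles.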

Discriminant analysis of principal components (DAPC)
Our study implemented the Discriminant Analysis of Principal Components method with a three-fold purpose. Our first objective was to assess the discriminatory power of individual SNPs in distinguishing the three breed clusters. We aimed to optimize the separation of individuals into predefined groups using discriminant functions of principal components by maximizing between-group diversity and minimizing within-group diversity. Our second objective was to investigate the genetic structure of the population, considering the existing knowledge about the pig breeds and their genetic variation. Finally, our third objective was to determine the probability of animals joining a particular population based on their genetic background.
After identifying SNPs of significant importance, we used the Variant Effect Predictor (VEP) tool from the Ensembl database (McLaren et al., 2016) to compare them with the "Pig Reference (Sus_scrofa)" database. This comparison aimed to uncover the genes and biological pathways associated with these SNPs. Additionally, we conducted a search in the "NCBI database" using the SNP marker names as keywords to investigate their involvement in biological processes.
To analyze the population structure, we employed the "adegenet" package in the R software (Jombart, 2008) to perform Discriminant Analysis of Principal Components. Subsequently, we employed the "pca3d" package (Weiner, 2020) to visualize how the most significant SNPs segregated individuals into different clusters.
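The optimization described in this section can be written compactly. The following is a sketch of the standard formulation rather than a transcription of the adegenet code, and all symbols are introduced here for illustration: X is the centered genotype matrix, U_k holds the first k principal axes, B and W are the between-group and within-group covariance matrices of the retained principal component scores Z, and c_j is the relative contribution of allele j to a discriminant function (assumed here to be the quantity to which a contribution threshold is applied).

```latex
\begin{aligned}
Z &= X U_k, \\
v^{*} &= \arg\max_{v} \frac{v^{\top} B\, v}{v^{\top} W\, v}, \\
a &= U_k v^{*}, \qquad c_j = \frac{a_j^{2}}{\sum_{l} a_l^{2}} .
\end{aligned}
```

Under this formulation the contributions c_j sum to 1 for each discriminant function, so a fixed cutoff selects the alleles that carry a disproportionate share of the between-breed separation.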

Functional enrichment analysis (FEA) of the most discriminating SNPs between the pig breeds
To determine the crucial biological functions that differentiate our three pig breeds, we performed a Functional Enrichment Analysis on a gene list comprising the genes housing the most significant breed-informative SNPs. We used the "gprofiler2" R package (Kolberg and Raudvere, 2021), which draws on various databases such as the Gene Ontology (GO) database, Kyoto Encyclopedia of Genes and Genomes (KEGG), WikiPathways (WP), Human phenotype ontology (HP), and micro-RNA target (MIRNA) databases, among others. The gene list was automatically generated from our informative SNP set identifiers and served as the input for the "gost" function within the "gprofiler2" R package. This function conducts Functional Enrichment Analysis, utilizing the Gene Ontology database. Our analysis included a statistical enrichment assessment using the hypergeometric test, and we applied multiple testing corrections to enhance result reliability. To minimize the potential for false positives, we set a user-defined significance threshold of 0.05.
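The hypergeometric test mentioned above can be sketched in a few lines of Python. This is an illustration of the statistic only, not the gprofiler2 code path: the function names and arguments are hypothetical, and the Bonferroni adjustment is used as a simple stand-in for whichever multiple testing correction was actually applied.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_term, n_query, n_overlap):
    """One-sided enrichment p-value P(X >= n_overlap).

    n_background: genes in the annotated background
    n_term:       background genes annotated to the term
    n_query:      genes in the query list
    n_overlap:    query genes annotated to the term
    """
    upper = min(n_term, n_query)
    total = comb(n_background, n_query)
    tail = sum(
        comb(n_term, k) * comb(n_background - n_term, n_query - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


def bonferroni(p_values):
    """Bonferroni-adjusted p-values (stand-in correction, capped at 1)."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

A term would then be reported as enriched when its adjusted p-value falls below the chosen threshold (0.05 in the text).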

Genetic diversity within population
The population sample shows a nearly equal proportion of the first and second alleles, with a slight preference towards the second allele (frequencies of 0.48 and 0.52, respectively). The observed proportion of heterozygotes in all three breeds is higher than expected, indicating a possible excess of heterozygotes. Our analysis of alpha diversity indexes reveals variability among different q-values, indicating a deviation from HWE. The average values of allelic richness, Shannon information, and heterozygosity are 2, 1.96, and 1.92, respectively (Figure 1). The negative value of the overall fixation index (FIS = −0.03) supports this deviation from HWE. We conducted statistical tests to identify loci that deviate from HWE, and 46 SNPs showed statistically significant deviations (see Supplementary Table S1). These deviations are primarily concentrated at the vertex that represents heterozygotes (AB). The results of the chi-square test for selection pressure suggest that there is no evidence of selection acting on any of the loci, and the absence of fixed alleles in any of the three breeds supports this conclusion. The exact p-values of the test of HWE deviations are reflected in a ternary plot (Figure 2), with significant deviations indicated by pink dots. The blue parabola represents the expected genotype frequencies under HWE, and the space between the green lines indicates deviations that are not statistically significant.
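To make the per-locus HWE test concrete, the sketch below computes a two-sided exact p-value for a biallelic locus by enumerating every heterozygote count compatible with the observed allele counts. It follows the general logic of exact HWE tests but is a plain enumeration with a hypothetical function name, not the optimized recursion of Wigginton et al. (2005) and not the dartR routine used in the study.

```python
from math import factorial


def hwe_exact_p(n_aa, n_ab, n_bb):
    """Two-sided exact HWE p-value for one biallelic locus."""
    n = n_aa + n_ab + n_bb
    n_a = 2 * n_aa + n_ab   # copies of allele A
    n_b = 2 * n - n_a       # copies of allele B

    def weight(het):
        # Unnormalized probability of `het` heterozygotes given n and n_a:
        # 2^het * n! / (hom_a! * het! * hom_b!), always an exact integer.
        hom_a = (n_a - het) // 2
        hom_b = (n_b - het) // 2
        return (2 ** het) * factorial(n) // (
            factorial(hom_a) * factorial(het) * factorial(hom_b)
        )

    # Heterozygote counts must share the parity of the allele count.
    hets = range(n_a % 2, min(n_a, n_b) + 1, 2)
    weights = {h: weight(h) for h in hets}
    total = sum(weights.values())
    observed = weights[n_ab]
    # Sum the probabilities of all outcomes no more likely than the observed.
    extreme = sum(w for w in weights.values() if w <= observed)
    return extreme / total
```

A locus with 100 animals that are all heterozygous, `hwe_exact_p(0, 100, 0)`, gives a vanishingly small p-value, which corresponds to a point at the AB vertex of the ternary plot.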

Genetic diversity/distance among the pig breeds
We used Euclidean distance, pairwise FST, and pairwise Nei's GST to examine the genetic differences between the three pig breeds. The heat maps in Figure 3 show the results, with genetic divergence in red and genetic similarity in blue. Our analysis showed that the LA and DU breeds are the most genetically different from each other. Their estimated Euclidean distance is 2.815, their pairwise FST is 0.048, and their pairwise Nei's GST is 0.052, the highest values among the three breed pairs. Conversely, the YO and LA breeds were found to be the most genetically similar, with an estimated Euclidean distance of 2.242, a pairwise FST of 0.029, and a pairwise Nei's GST of 0.033, indicating a close genetic relationship between these two breeds.
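As a minimal illustration of how a pairwise differentiation measure of this kind is formed, the following Python sketch computes Nei's GST for two populations from per-locus allele frequencies. It is a simplified version that assumes biallelic loci and equal population weights and omits the sample-size corrections of Nei (1987); the function name is hypothetical, and the sketch is not the dartR routine behind the reported values.

```python
def nei_gst(freqs_pop1, freqs_pop2):
    """Pairwise Nei's G_ST from per-locus frequencies of one allele.

    G_ST = (H_T - H_S) / H_T, with H_S the mean within-population
    expected heterozygosity and H_T the expected heterozygosity of the
    pooled populations, both summed over loci before taking the ratio.
    """
    hs_sum = 0.0
    ht_sum = 0.0
    for p1, p2 in zip(freqs_pop1, freqs_pop2):
        hs_sum += (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
        p_bar = (p1 + p2) / 2
        ht_sum += 2 * p_bar * (1 - p_bar)
    return (ht_sum - hs_sum) / ht_sum if ht_sum > 0 else float("nan")
```

Identical allele frequencies give a value of 0 and fixed alternative alleles give 1, so the small values reported here indicate that most of the variation is shared between breeds.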

Discriminant analysis of principal components (DAPC) to explore the pig populations structure
To further explore the population structure, we generated a DAPC plot based on the first and second Principal Components (PCs) (Figure 4A). We used the alpha-score optimization method (Jombart and Collins, 2015) to determine the necessary number of PCs. The clusters in the DAPC plot were defined by prior knowledge of population membership (K = 6). We retained 30 PCs, explaining 40% of the overall genetic variability, as input to the Discriminant Analysis.
The DAPC plot showed clear clustering of individuals by breed, with the separation between breeds being more distinct in the first discriminant function (Figure 4B). The average assignment probability was 99% for the DU breed and 100% for the YO and LA breeds. We identified 28 SNPs that contributed most to breed differentiation based on a threshold of 0.01, and their names are listed in Supplementary Table S2. We performed a PCA on the 300-pig population using these 28 SNPs as variables, and the resulting plot showed clear clustering of individuals by breed (Figure 5). The reduced dataset's overall assignment probability was 74%, with the YO breed having the highest assignment rate (90%), followed by the LA breed (73%) and the DU breed (60%). The assignment rate using the whole dataset was higher than the rate obtained using only the most contributing SNPs. However, the assignment rate achieved using the most informative SNPs remained notably high, standing at no less than 60% (Figure 6).

Functional enrichment analysis (FEA) of the most discriminating SNPs between the pig breeds
The Functional Enrichment Analysis of the genes harboring the most breed-informative SNPs revealed three important biological functions: (1) nucleosome, (2) DNA packaging complex, and (3) structural component of chromatin (Figure 7). These functions are crucial for regulating gene expression and maintaining DNA's structural stability within the nucleus (Alberts et al., 2002). Nucleosomes are integral components of chromatin that organize and compact DNA into a condensed structure. The DNA packaging complex plays a crucial role in assembling and disassembling nucleosomes and regulating chromatin structure and function. The structural constituents of chromatin provide mechanical support to the chromatin fiber, maintaining its integrity. Table 1 presents the short names of these functions and their corresponding p-values, sorted in decreasing order of significance following hypergeometric testing and multiple testing adjustments.

Discussion
Through our study, we have uncovered the genetic diversity present in three commercially important pig breeds, namely, Landrace, Yorkshire, and Duroc. These findings hold significant implications for breeding programs and conservation initiatives focused on preserving the genetic diversity within pig populations.
During our investigation, we observed notable genetic variability in our coding variants across the three breeds. Additionally, the Hardy-Weinberg equilibrium test revealed deviations from the expected population equilibrium. We also noted variations in the diversity q-values and an overall negative FIS value. The excess of heterozygosity in our dataset likely contributed to the observed departure from HWE at 46 loci. It is noteworthy that our population does not appear to be subjected to selective pressure, and the deviations may be attributed to random mating among pig individuals, resulting in an isolate-breaking effect (Hamilton, 2021). The identification of informative SNPs, particularly those located in coding regions, is crucial for developing cost-effective SNP panels to facilitate efficient genotyping and breeding selection. This approach can improve the accuracy and effectiveness of pig breeding programs, leading to the development of more robust and productive pig breeds (Fontanesi et al., 2015). Investigating coding SNPs is also important for preventing genetic diseases caused by mutations in specific genes. By identifying these mutations and integrating them into breeding programs, the prevalence of these diseases in pig populations can be reduced, resulting in improved animal welfare and decreased economic losses for farmers (Mellencamp et al., 2008).
Previous research has identified informative SNPs for differentiating among various species, including cattle breeds (Cheong et al., 2013; Zwane et al., 2016; Bertolini et al., 2018) as well as wild boars and domestic pigs (Lorenzini et al., 2020). While previous studies have focused on identifying informative SNPs among commercial pig breeds (YO, DU, and LA) using noncoding SNPs (Schiavo et al., 2020; Hayah et al., 2021), our study aimed to identify informative SNPs using only coding variants.
In our study, we found 28 SNPs that help distinguish the three pig breeds. Of these, six deviated from Hardy-Weinberg expectations. The presence of these deviating SNPs highlights their importance as potential markers for distinguishing between the pig breeds. However, further research is needed to validate and elucidate the precise roles and contributions of these SNPs in breed differentiation.
Previous studies have already provided valuable insights into the implications of specific SNPs that we have identified in our research. For instance, a previous genome-wide association study (Große-Brinkhaus et al., 2015) demonstrated a significant association between the SNP ALGA0039432 and boar taint as well as testes size parameters. This finding underscores the relevance of this particular SNP to these specific traits.
Moreover, our analysis identified two SNPs, namely, ALGA0060925 and DRGA0005996, as key contributors to breed differentiation. ALGA0060925 is a downstream variant on chromosome 11 associated with a long non-coding RNA (lncRNA). In contrast, DRGA0005996 is located on SSC5 and corresponds to the CPNE8 gene, which encodes the copine-8 protein.
Copine-8 is a calcium-dependent phospholipid-binding molecule that plays a crucial role in calcium-mediated intracellular processes. Dysregulation of CPNE8, a member of the Copine family, has been associated with various diseases such as prion disease and gastric cancer in previous studies (Lloyd et al., 2013; Zhang et al., 2022). These findings suggest that CPNE8 may have multifaceted roles beyond breed differentiation and warrants further investigation in relation to its potential involvement in disease pathways.
Furthermore, several other SNPs within our dataset have been previously associated with various phenotypic traits. For example, the intergenic variant ASGA0077916 has demonstrated a significant correlation with the fatty acid composition of the Longissimus dorsi muscle (Sambache Tayupanta, 2016). Another SNP of interest, ASGA0072056, is located on SSC16 within the RETREG1 gene, which encodes the reticulophagy regulator 1.

FIGURE 6
Comparison of the overall probability of reassignment to the actual breed, estimated with DAPC using the initial 300 SNPs and the 28 selected breed-informative SNPs.
Dysregulation of the RETREG1 gene has been linked to the development of numerous diseases (Islam et al., 2018). In the context of viral diseases, other studies have highlighted the relationship between the absence of the RETREG1 protein and heightened replication of Dengue and Zika viruses (Lennemann and Coyne, 2017). ASGA0008283 is an intergenic variant on SSC1. ASGA0072056 and ASGA0008283 have been shown to be determinant factors in tracing the breeding farm of domesticated pigs (Kwon et al., 2017).
Lastly, ALGA0078229 is situated on SSC14 within the RET gene, which encodes the proto-oncogene tyrosine-protein kinase receptor RET. Dysregulation of RET has been implicated in the development of various tumor types (Zhao et al., 2023). Additionally, a previous study found a significant association between ALGA0078229 and meat quality in German Landrace pigs (Ponsuksili et al., 2014).
Moreover, we investigated the biological processes associated with the SNPs that exhibited deviations from Hardy-Weinberg equilibrium. Notably, one genome-wide association study demonstrated a significant association between ALGA0077162 and immune-relevant traits in the Landrace breed (Dauben et al., 2021). Additionally, ASGA0050304 was identified as a quantitative trait locus strongly linked to intramuscular fat (IMF) in the gluteus medius (GM) and longissimus dorsi (LD) muscles of Duroc pigs (González Prendes, 2017).
Regarding the Functional Enrichment Analysis, our results revealed three enriched functions: the nucleosome, the DNA packaging complex, and the structural components of chromatin. These components play crucial roles in DNA packaging, organization, and gene expression, thereby ensuring the efficient functioning of critical nuclear processes such as transcription, replication, and DNA repair (Alberts et al., 2002). The nucleosome was identified as the most significant function, with the lowest p-value. Previous studies have demonstrated a correlation between increased circulating nucleosomes and inflammation as well as autoimmune diseases (Schwarzenbach et al., 2011; Pisetsky, 2012). Therefore, nucleosomes are believed to have the potential to initiate immune responses (Rönnefarth et al., 2006). Moreover, the activation of chromatin is vital for the immune response, with receptor engagement triggering reaction cascades that activate transcription factors and the chromatin template (Paz and Josefowicz, 2021). This synergistic activation of select genes is particularly evident in macrophages during inflammation, where they can rapidly express hundreds of genes (Paz and Josefowicz, 2021), thus highlighting the intricate relationship between chromatin dynamics and immune processes.
Investigating these functions and their underlying molecular mechanisms could offer new insights into the regulation of gene expression associated with chromatin abnormalities. In summary, our study highlights the effectiveness of DAPC in evaluating the genetic structure and admixture levels of pig breeds. The clear breed-specific separation of individuals seen in the DAPC and PCA plots supports our finding that these three pig breeds have distinct genetic backgrounds. Despite using only coding variants, the SNPs selected by the DAPC approach were able to assign individuals to their respective breeds with a 74% probability of correct assignment. Although this does not match the assignment rate achieved with the full dataset, it highlights the importance of carefully selecting impactful genetic markers for analysis. As a result, targeting coding regions associated with traits of interest provides a more straightforward analysis of genome-wide variants and yields more explicit results.
The SNPs discovered in this study have the potential to be used as markers for pig breed identification and conservation initiatives. Further research with larger sample sizes can provide a more comprehensive understanding of the genetic structure of these pig breeds and identify additional coding SNPs that contribute to breed differentiation. Further investigations and experiments can also deepen our understanding of the functional significance and underlying mechanisms of the identified SNPs.

Conclusion
This study highlights the significant genetic variation present in gene-coding regions among three Italian pig breeds. The Landrace and Duroc breeds were found to be highly divergent, while the Landrace and Yorkshire breeds exhibited closer genetic similarities.
Notably, we identified 28 coding SNPs that were particularly informative in differentiating between these breeds, with enough genetic information to form distinct clusters of individuals. Investigating the signaling pathways and functional implications of these SNPs could provide valuable insights into the underlying genetic mechanisms that contribute to breed differentiation. While whole-genome analysis can determine genetic diversity, focusing on breed-specific coding SNPs can streamline the analysis by targeting specific regions relevant to the research question.

FIGURE 2 Ternary plots illustrating the patterns of Hardy-Weinberg (HW) proportions. Each vertex on the plot represents a different genotype: homozygous for the reference allele (AA), heterozygous (AB), and homozygous for the alternative allele (BB). The plots highlight loci that deviate significantly from Hardy-Weinberg equilibrium, and these loci are indicated in pink. The blue parabola on each plot represents Hardy-Weinberg equilibrium, while the area between the green lines represents the acceptance zone. The plots provide a visual representation of the distribution of the SNPs in relation to the Hardy-Weinberg equilibrium and allow for the identification of loci that may be under selection or experiencing other evolutionary forces.

FIGURE 3 Distance measures between pig populations. (A) Pairwise FST, (B) Pairwise GST, and (C) Euclidean distance. The warmer the color, the more genetically distant the two breeds concerned.

FIGURE 4 Visualization of the distribution of the 300 individuals according to the 300 SNPs (A) considering the first two discriminant functions, and (B) considering the first discriminant function only.

FIGURE 5 Two-dimensional visualization of the distribution of individual pigs based on the 28 most informative SNPs, using the first and second principal components.

FIGURE 7 A graphical representation of the adjusted p-values in the negative log10 scale for enriched functions obtained from various databases, including Gene Ontology Molecular Functions (GO:MF), Gene Ontology Cellular Components (GO:CC), Gene Ontology Biological Processes (GO:BP), Kyoto Encyclopedia of Genes and Genomes (KEGG), Reactome Pathway (REAC), micro-RNA target (MIRNA), Human phenotype ontology (HP), and WikiPathways (WP). The enriched functions, namely, (1) nucleosome, (2) DNA packaging complex, and (3) structural component of chromatin, are plotted against their respective databases.

TABLE 1 Top 3 significantly enriched functions according to their p-values. The p-values are below 0.01, which indicates that the observed enrichment is statistically significant.
a The abbreviation of the data source for the term (Gene Ontology Molecular Functions (GO:MF), Gene Ontology Cellular Components (GO:CC)); b Unique term identifier; c The short name of the function; d Number of genes that are annotated to the term.