Genetic and Genomic Characterization of a New Beef Cattle Composite Breed (Purunã) Developed for Production in Pasture-Based Systems

Purunã is a composite beef cattle breed, developed in Southern Brazil by crossing the Angus, Charolais, Canchim, and Caracu breeds. The goal of this study was to perform the first genetic characterization of the Purunã breed, based on both pedigree and genomic information. For this, 100 randomly selected animals were genotyped, and 11,205 animals born from 1997 to 2019 had pedigree information. The genetic analyses performed were principal component analysis, admixture, phylogenic tree, pedigree and genomic inbreeding, linkage disequilibrium (LD), effective population size (Ne), consistency of the gametic phase, runs of homozygosity (ROH), heterozygosity-enriched regions (HERs), and functional analyses of the ROH and HER regions identified. Our findings indicate that Purunã is more genetically related to the Charolais, Canchim, and Angus breeds than Caracu or Nellore. The levels of inbreeding were shown to be small based on all the metrics evaluated and ranged from −0.009 to 0.029. A low (−0.12–0.31) correlation of the pedigree-based inbreeding compared to all the genomic inbreeding coefficients evaluated was observed. The LD average was 0.031 (±0.0517), and the consistency of the gametic phase was shown to be low for all the breed pairs, ranging from 0.42 to 0.27 to the distance of 20 Mb. The Ne values based on pedigree and genomic information were 158 and 115, respectively. A total of 1,839 ROHs were found, and the majority of them are of small length (<4 Mb). An important homozygous region was identified on BTA5 with pathways related to behavioral traits (sensory perception, detection of stimulus, and others), as well as candidate genes related to heat tolerance (MY O 1A), feed conversion rate (RDH5), and reproduction (AMDHD1). A total of 1,799 HERs were identified in the Purunã breed with 92.3% of them classified within the 0.5–1 Mb length group, and 19 HER islands were identified in the autosomal genome. These HER islands harbor genes involved in growth pathways, carcass weight (SDCBP), meat and carcass quality (MT2A), and marbling deposition (CISH). Despite the genetic relationship between Purunã and the founder breeds, a multi-breed genomic evaluation is likely not feasible due to their population structure and low consistency of the gametic phase among them.


INTRODUCTION
The characterization of the population structure and genetic diversity is essential for the understanding of the genetic background of environmental adaptation and conservation of cattle genetic resources (Xia et al., 2021). Such characterization and diversity assessment need to be considered when designing or updating breeding programs and conservation strategies that can be applied in purebred and crossbred populations.
The Purunã breed is a composite population developed in Southern Brazil by crossing Angus, Charolais, Canchim, and Caracu, in identical proportions. This was performed to improve key traits of interest and exploit the complementarity among the breeds (Otto et al., 2021), especially for production in pasture-based systems. The background research to generate the Purunã breed started at the Agronomic Institute of Paraná (IAPAR; Ponta Grossa, Paraná, Brazil) at the beginning of the 1980s, when IAPAR researchers estimated the heterosis in the crossbred progenies of Charolais x Caracu and Angus x Canchim (Perotto et al., 2000a;Perotto et al., 2000b). Almost 15 years later, the first results were obtained from this experiment, where the heterosis retained from those crosses resulted in higher hot carcass weight, hot carcass yield, rib-eye area, better carcass conformation from Charolais x Caracu (Perotto et al., 2000b), and higher average weight daily gain in different ages with the crosses of Angus x Canchim (Perotto et al., 2000a).
Based on the first results, IAPAR researchers conducted a second mating to generate another set of animals using the progenies resulting from the previous F1 population. The goal at that point was to combine all favorable characteristics in a composite that presented a heavyweight and produce a highquality carcass. The hypothesis for using the breeds mentioned earlier was to capture a particular contribution from each breed to create a composite population with higher productive performance and adapted to the tropical and subtropical regions of Brazil. The Angus breed provided traits related to precocity, more docile temperament, and high meat quality (Cristiana and Mirela, 2018;Taye et al., 2018); Charolais provided a higher weight gain and carcass yield (Jahuey-Martínez et al., 2019), and finally, Caracu and Canchim contributed with rusticity, heat tolerance, and parasite resistance (Urbinati et al., 2016;Pires et al., 2021). Such animals are very well adapted to tropical environmental conditions and showed good potential to gain weight (Ito et al., 2010). The characterization of Purunã, defined by the Brazilian Purunã Cattle Breed Association (Ponta Grossa, Paraná, Brazil), is that the animals must present short hair with a shiny aspect, admitting variation on the coat color (red, white, black, and bay), medium-to-large size, and good muscle distribution as shown in Figure 1. Additionally, the animals are expected to be docile and prolific, with sexual precocity and fast carcass finishing.
Some studies in Brazil have evaluated the performance of the Purunã breed for carcass traits (Ito et al., 2010), meat production and quality traits (Missio et al., 2015), growth (Moura et al., 2014), and weight at different time points (Otto et al., 2021). These studies indicate animals slaughtered at 24 months of age, weighed an average of 460 kg, possessed a fat thickness close to 3 mm, and had a substantial concentration of fatty acids in the meat. In addition, estimates of genetic parameters for growth traits have demonstrated heritable estimates (0.05-0.21) for body weights measured at different ages (Otto et al., 2021). However, no previous research to date has evaluated the genetic diversity and population structure of the Purunã breed. Although the crossbreeding increases the genetic variability of a population, their development history and population management across generations could have impacted the genetic diversity of the population formed (Peripolli et al., 2020). Genetic diversity studies play an important role in the constitution of a crossbreeding program since how the variability is controlled may interfere with the heterosis produced and impact the expected hybrid vigor.
Genetic diversity studies are crucial in the initial phase, called prebreeding, in which it is possible to regenerate, characterize, explore, and promote the conservation of variability of the population (Pontes et al., 2020). The parameters estimated include inbreeding coefficients of the individual animals, the genetic relationship between animals, and overall levels of homozygous and heterozygous regions in the genome as well as their distribution along the chromosomes (Biscarini et al., 2020). Furthermore, linkage disequilibrium needs to be estimated for better implementing genomic selection and for identifying conserved segments of the genome among breeds (Larmer et al., 2014). All these metrics contribute to a better understanding of the genetic events that happen in the population, the impact of decisions made in the past, and the strategies that will be taken in the future. Our goal with this study was to characterize the genetic and genomic diversity and the population structure of a new composite beef cattle breed-Purunã, based on genomic and pedigree information.

MATERIAL AND METHOD
One hundred animals of the Purunã breed were randomly sampled and genotyped using the GGP Bovine 100K array (GGP, 2021) containing over 100,000 single nucleotide polymorphisms (SNPs). The genetic material was provided by the Agronomic Institute of Paraná (IAPAR, Ponta Grossa, Paraná, Brazil). For the genotype quality control (QC), only autosomal chromosomes were retained and a QC was performed separately for each analysis. For runs of homozygosity (ROH) and heterozygous-enriched regions (HER), we removed SNPs with call rate lower than 0.90, duplicated position, non-autosomes, or without a known position (Ferenčaković et al., 2013;Biscarini et al., 2020). For the other analyses, minor allele frequency (MAF <0.05) and extreme departure from the Hardy-Weinberg equilibrium (HWE <10 -6 ) parameters were also used to filter out SNPs.
For the pedigree database, information from 11,205 animals born between 1997 and 2019 was considered, where 5,224 were males and 5,981 females, and the base population was formed by 3,999 animals. These data were used to create the pedigree database including information on individual animals, sire, dam, sex, and birth date.

Principal Component Analysis
To assess the similarities between the Purunã breed and Angus, Canchim, Charolais, and Nellore breeds, we performed a principal component analysis (PCA) by PLINK v1.9 software (Purcell et al., 2007). The genotypes of the Angus, Charolais, and Canchim were retrieved from the WIDDE database (Sempéré et al., 2015), and the Nellore breed genotypes were provided by the Katayama Agropecuaria Ltda breeding company. The PCA was estimated based on the standardized variance of the genomic relationship matrix (G) where the covariance of each SNP was divided by the respective variance, using only the SNPs in common for all breeds (after the QC), as the following equation proposed by (VanRaden, 2008): where M is a matrix of counts of allele A, p i is the frequency of allele A of ith SNP, and P is a matrix with each row containing the p i values.

Admixture Analysis
The admixture analysis was performed using the ADMIXTURE software (Alexander et al., 2015) to assess the evolutionary history between the Purunã breed and its founder breeds (Angus, Charolais, Canchim, and Nellore). This analysis estimates ancestries by efficiently computing maximum likelihood estimates in a parametric model as (Alexander and Lange, 2011): where p ij is the success probability in the binomial distribution n ij Bin(2, p ij ) depending on the fraction q ik of i's ancestry attributable to population k and on the frequency f kj of allele 1 in the population k. The matrices Q = (q ik ) and F = (f kj ).
The success of the analysis is dependent on the correct choice of K, which represents the number of ancestral populations. We evaluated K equal to 1 until 20, but only K = 2 and 3 were chosen to be shown here, which have more biological interpretation and K = 3 had the smallest cross-validation error. The "pong" package (Behr et al., 2016) was used to cluster the results and visualize the population structure.

Phylogenetic Tree
To estimate the distance among the populations, we used the hapFLK software (Fariello et al., 2013) based on the approach described by Bonhomme et al. (2010). The neighbor-joining tree was built from the Reynolds' genetic distances (Reynolds et al., 1983) between pairs of populations. Reynold's distance was estimated using the co-ancestry coefficient, where this coefficient is the probability that a random pair of genes at the same locus within a randomly chosen population is identical-bydescent, providing a natural measure of genetic drift. It is assumed that the allele frequency is equal tô where p is the frequency, F is the co-ancestry matrix, andp 0 is the unbiased linear estimate with minimum variance, with 1′n denoting the n-vector made of 1's.

Inbreeding Metrics
Six models of inbreeding coefficient estimates were analyzed. The first model was based on pedigree information (F PED ), using the ENDOG v4.8 software (Gutiérrez et al., 2010), following the method proposed by Meuwissen and Luo (1992) in which the average F of a given generation t (F t ) was calculated as follows: in which ΔF is the change in the inbreeding rate from one generation to another, as the following equation: in which F t and F t-1 represent the average inbreeding estimates for the current and the previous generation (Falconer and Mackay, 1996). The second method was based on the homozygous genotypes observed and expected (F HOM1 ), calculated as follows (Purcell et al., 2007): where H exp is the expected value (proportion) for homozygous genotypes, and H obs is the observed value for the homozygous genotypes.
The third method was based on genotype additive variance (F GRM ), using the following model (VanRaden, 2008): where x i is the number of reference allele copies of the ith SNP, and p i is the reference allele frequency in the population. Similar to the second method, the methodology F HOM2 was based on homozygous genotypes following the model: The aforementioned models are all dependent on the genotype allele frequency, and for this reason, a fifth model was a test based on the correlation between uniting gametes (F UNI ) using the following model Yang et al. (2010): The last method was based on the sum of ROH individual length divided by the total length of the autosomal genome (F ROH ) using the following equation (McQuillan et al., 2008):  (Purcell et al., 2007). The PROC CORR option of the SAS statistical software (SAS Institute Inc., 2013) was used to correlate the inbreeding coefficient estimates. A heatmap was created for better visualization of the results through the "plotly" package (Sievert, 2020).

Linkage Disequilibrium
The linkage disequilibrium (r 2 ) was estimated by PLINK v1.9 software. To observe the r 2 decrease along with the increase in the marker distance, we used the binning approach estimating the r 2 average of each distance from 10 to 100 kb in each 10 kb, and after the distance of 100 kb in each 100 kb until the distance of 1,000 kb (1 Mb). As a preliminary analysis, we defined that the bins reported in this study were required to have at least 50 pairwise markers to estimate the binned average of r 2 .

Effective Population Size
Two methodologies were used to estimate the effective population size (Ne). The first method used pedigree information through the following equation:  Ne where F t and F t−1 are the average inbreeding of offspring and their parents, respectively (Falconer and Mackay, 1996). The estimate was performed using the POPREP software (Groeneveld et al., 2009). The second method was performed using genomic information, and investigated with the relationship method between LD variances and Ne through the following formula (Corbin et al., 2012): where Ne is the effective population size at the tth generation, c t is the recombination rate for the physical distance between the markers, α is the probability for the occurrence of mutation, and r 2 is the LD value.

Consistency of the Gametic Phase
The consistency of the gametic phase (CGP) was taken by the square root of r 2 values adding the sign from the disequilibrium metric (D), as: where p(a) is the frequency of the haplotype-a, p(b) is the frequency of the haplotype-b, and p ab is the haplotype frequency with allele a on the first locus and allele b on the second locus. The CGP was assumed as the Person correlation between each founder breed and Purunã using the signed-squared-root values. To estimate the CGP, only the SNPs in common (after the quality control) between each breed pair were used to estimate the CGP based on the same distance and bin described in the LD section.

Proportion of Polymorphic SNPs and Distributions of SNPs by the MAF Range
The proportion of polymorphic SNPs, after QC, was calculated based on the MAF. The distributions of SNPs were calculated on 10 MAF ranges from 0 to 0.5 defined every 0.05 points in MAF.

Runs of Homozygosity
The PLINK v1.9 software was used for the ROH identification based on the following criteria: • One heterozygous and one missing SNP were allowed; • The window of the threshold used was 0.05; • The gap between consecutive SNPs could not be higher than 1,000 kb; • The minimum length of an ROH was 500 kb; • The minimum number of consecutive SNPs that create an ROH must be equal to or greater than 30; • The density of 1 SNP used in at least 50 kb; • A sliding genomic window was used with 50 SNPs.

Heterozygosity-Enriched Regions
The detectRUNS package (Biscarini et al., 2019) was used for the detection of HER following the consecutive-SNPs method. For the SNPs' consecutive analysis, the following parameters were considered: • a minimum number of 20 consecutive SNPs constitutes an HER; • a minimum length of 500 kb; • a minimum of two homozygous and one missing SNP is allowed; and • the maximum gap between consecutive SNPs could not be higher than 1,000 kb.
The genomic regions that showed at least 10% of the animals with HER were included in the subsequent functional analyses and phylogenetic tree.

Functional Analyses
The genomic regions considered as ROH and HER islands were used for genomic annotations. The GALLO package (Fonseca P. A. S. et al., 2020) was used for the annotation of genes in these regions, with the annotated data for Bos taurus from the Ensembl database (www.ensembl.org/Bos_taurus/Info/Index), version ARS-UCD1.2 (Rosen et al., 2020). Subsequently, the WebGestaltR package (Wang et al., 2020) was used for the Gene Ontology (GO) analyses to identify biological processes, molecular functions, and cellular components in which the positional candidate genes are involved in.

Principal Component Analysis
The PCA among the populations of Purunã, Angus, Canchim, Charolais, and Nellore is presented in Figure 2A. The first principal component (PC1) explained 20.2% of the variation among the populations, while the second principal component (PC2) accounted for 3.2%. As shown in Figure 2A, the animals are grouped within breeds, with no clear mixture between groups, even for composite populations such as Purunã. The breeds closer to the Purunã are Charolais, Canchim, and Angus. Figure 2B presented the admixture analysis for Purunã, Angus, Canchim, Charolais, and Nellore populations for K = 2 and 3. For K = 2, two groups were observed and the mixture between them indicates that two distinct founder populations (Bos taurus taurus and Bos taurus indicus) were used when developing the Purunã breed. In average, Angus had 99.2% and 0.9%, Canchim 60.6% and 39.4%, Charolais 89.6% and 10.4%, Nellore 99.8% and 0.2%, and Purunã had 80.8% and 19.2% from ancestral population Frontiers in Genetics | www.frontiersin.org

Admixture Analysis
July 2022 | Volume 13 | Article 858970 1 and 2, respectively. For K = 3, three groups were observed to affect the admixture analysis for the population in the study. This result (K = 3) indicates more contribution from Charolais and Canchim in the Purunã breed, following the Angus breed and a small proportion of the Nellore breed. Figure 2C shows the genomic population tree for the breeds Purunã (PUR), Angus (ANG), Charolais (CHL), Canchim (CAN), and Nellore (NEL). There is a division into groups on the tree but the distance from one group to another is not high (0.05). Nellore appears in one section while Purunã, Canchim, Charolais, and Angus are situated in three other nodes, grouping in accord with clades of breed proximity.

Inbreeding
The averages of inbreeding coefficients are presented in Table 1.
The average for the inbreeding coefficient estimated based on pedigree for all Purunã individuals (F PED2 ) was 0.002. The methods F HOM1 , F HOM2 , F UNI , and F GRM were the methods showing the lowest average values (−0.009), while the highest inbreeding coefficient average was obtained by the F ROH metric (0.029). The correlations among the inbreeding coefficients method are presented in Figure 3.

Linkage Disequilibrium
The average LD ranged from 0.43 to 0.04, with a distance between two markers of 10 to 1,000 kb, respectively. The general average of LD was 0.031 (±0.0517) at the average distribution of the markers 4.856 (±2.8890) Mb. The decrease in LD with an increase in the marker distance can be observed in Supplementary Figure S1.

Effective Population Size
The effective population size based on pedigree was 158 for the current generation. On the other hand, the genomic-based Ne differed based on the generation and software used. The SNeP software (Barbato et al., 2015) and the PLINK software enabled the estimation of Ne up to the 13th and 5 th generation back, respectively. For the SNeP, in the 13th generation, the Ne was 229, while for the PLINK, the result for the same generation was 207. The Ne estimated for the 5 th generation on PLINK was equal to 115.

Consistency of the Gametic Phase
The consistency of the gametic phase between Purunã and Angus, Canchim, Charolais, and Nellore is presented in Table 2.
The highest correlation, at 20 kb between SNP pairs, between Purunã and the other breeds was found with Charolais (0.43), followed by Canchim (0.42), Angus (0.40), and Nellore (0.27). The distance of 10 kb showed a lower number of pairwise markers than the threshold (<50) used as a criterion. Therefore, these results were not presented.

Proportion of Polymorphic SNPs and Distribution of SNPs by MAF Range
The proportion of polymorphic SNPs based on the MAF category were as follows: MAF 0.00-0.05 3,232 (3.67%); MAF 0.05-0.10 2,967

Runs of Homozygosity
A total of 1,839 ROHs were found for the Purunã breed. The distribution along all autosomal genomes can be observed in Figure 4A and the ROH length size division. The length of ROH observed here can be classified as 37.4% for <2 Mb; 25.3% as 2-4 Mb; 17.1% as 4-8 Mb; 7.6% as 8-16 Mb; and only 2.6% ROH greater than 16 Mb. The chromosome that presented the highest amount of ROHs was the BTA5, followed by the BTA1, where the concentration of ROHs >16 Mb was superior compared to all other autosomes. The chromosomes that showed the smallest number of ROHs were the BTA27 and BTA25, representing a small fraction of regions in ROH.

Heterozygous-Enriched Regions
In total, 1,799 HERs were found in the Purunã breed. The HER pattern distribution along with all autosomal genomes is shown in Figure 4B. Around 92.3% of the HERs found were classified in the length of 0.5-1.0 Mb; 7.0% as 1.0-1.5 Mb; and 0.7% as 1.5-2.0 Mb. No HER greater than 2 Mb was found for Purunã. The chromosome that presented the highest amount of HER was the BTA1, while BTA25 had the smallest number of HER.

Runs of Homozygosity
With the ROH analysis, we found a common region in homozygosity present in 36% of the animals, despite the fact that Purunã is a recently-developed composite breed. This region is located on BTA5 between 54, 304, 681 bp and 62, 031, 799 bp, and has a length of 7.73 Mb, where 131 SNPs are present in this region. This region is responsible for coding 220 genes, with 181 protein-coding genes, seven pseudogenes, five long non-coding RNA, nine microRNA, ten miscellaneous RNA, six small nucleolar RNA, and two small nuclear RNA. The list of all the genes found in this region is presented in Supplementary Table S1. The significant Gene Ontology (GO) terms (p < 0.05) in which these genes are part of are presented in Table 3. Ten biological processes, three molecular functions, and six cellular components were identified in the significant pathways. Interestingly, pathways linked to animal behavior were found in this region, including sensory perception (GO: 0007600), detection of stimulus (GO:0051606), response to extracellular stimulus (GO:0009991), olfactory receptor activity (GO:0004984), and others. To track the origin of this homozygous region in Purunã, we performed a  phylogenetic tree analysis, using only the SNPs allocated in this region. Figure 5A shows the phylogenetic tree for the homozygous region found in BTA5. In this particular region, the breeds Purunã, Charolais, and Angus are closer together in comparison to the Canchim and Nellore breed, indicating that Charolais and Angus might have contributed to key behavioral characteristics observed in the Purunã breed.

Heterozygous-Enriched Regions
For the HER analysis, the regions identified in at least 10% of the animals were considered as HER islands and used to verify the candidate genes and pathways. Table 4 presents the HER island found in the Purunã breed. We found 19 HERs distributed in 17 chromosomes, where the BTA5 and BTA14 presented two HERs in each chromosome. The most frequent HER (27% of the population) was found in BTA23. The longest HER was found in BTA22 with a length size of 3.67 Mb, and the smallest HER was found in the BTA15 at 0.60 Mb.
All these regions are responsible for coding 413 genes, including 363 protein-coding, six pseudogenes, 13 long noncoding RNAs, seven microRNAs, two miscellaneous RNAs, one small nucleolar RNA, 15 small nuclear RNAs, three processed pseudogenes, and three ribosomal RNAs. The list of all the genes found in these regions is presented in Supplementary Table S2. The significant GO terms (p < 0.05) and their related genes are presented in Table 5.
In total, we found 17 significant GO terms involved in biological processes, eight in molecular functions, and six in cellular components. Interesting regions related to the growth pathways (GO:0,040,007) were found in the heterozygousenriched regions in BTA10, BTA14, BTA18, BTA19, and BTA20, where 14 genes are acting in higher variability in the population. The genes are PPIB (peptidylprolyl isomerase B), SDCBP (syndecan binding protein), MT2A and MT3 components of metallothionein, RAI1 (retinoic acid-induced 1), FLCN (folliculin), DCAF1 (DNA damage-binding protein 1), CISH (cytokine-inducible SH2), HYAL1 and HAYAL2 components of hyaluronidase, SEMA3B and SEMA3F components of semaforin, ARIH2 (ariadne RBR E3 ubiquitin protein ligase 2), and IP6K2 (inositol hexakisphosphate kinase 2). To track the origin of these heterozygous regions in Purunã, we used a phylogenetic tree analysis. Figure 5B shows the phylogenetic tree for the regions related to the growth pathway found in the heterozygous-enriched regions.

DISCUSSION
Our main goal in this study was to genetically characterize the Purunã breed by estimating genetic diversity and population structure parameters based on both genomic and pedigree information. This breed was developed by crossing, in the same proportion, Charolais, Canchim, Angus, and Caracu breeds, while Canchim is also a composite breed that has Charolais and Nellore as the main founder breeds. Therefore, the average genetic proportion for Purunã is 13/32 Charolais, 8/32 Caracu, 8/ 32 Angus, and 3/32 Nellore. Based on the results from the population stratification section, the Purunã breed seems to be genetically closer to Charolais, Canchim, and Angus, with the highest genomic contribution from the Charolais breed ( Figure 2B, K = 3). Yet, our findings indicate that Purunã is closer to the Bos taurus taurus than Bos taurus indicus breeds. This was expected due to the greater contribution of taurine breeds in the formation of the Purunã breed.

Inbreeding Metrics
The maintenance of low levels of inbreeding is also desirable in composite breeds, once an advantage of crossbreeding is heterosis. Such heterosis is influenced by the genetic distance between the parental breeds and the level of inbreeding in the population, which can affect the degree of heterosis retention (Peripolli et al., 2020). As shown in Table 1, the inbreeding coefficient for all the metrics estimated in this study ranged from −0.009 (±0.041) to 0.029 (±0.024). These results are expected as the Purunã breed is a recently developed composite breed. The low level of inbreeding, with more emphasis on negative values (outbreeding), indicates that the probability of the two homologous genes within an individual being identical-by-descendent is smaller than two homologous genes drawn at random from the reference population and the ancestry shared into the population is small (Wang, 2014). However, in terms of gain or loss variability to a reference base population (Villanueva et al., 2021), the values indicated that some variability had been gained (through migration or gene flow from other populations) or, in the cases where the inbreeding coefficient was positive, a slight loss of variability. An accurate measure of F PED could be expected when a complete, deep (many generations recorded) and no (or few) errors in the pedigree files. In the case of the Purunã breed, the information in the pedigree files goes to, on average, 2.35 generations. As the Purunã pedigree is shallow, the use of genomic information to estimate the inbreeding coefficients is a great alternative to access the inbreeding levels of the individuals. These inbreeding metrics, in addition, are not dependent on pedigree information, taking into account the Mendelian sampling variation (Doekes et al., 2019), the stochastic nature of recombination (Ferenčaković et al., 2013), and correcting the pedigree failure to assume that the founders of a population are unrelated (Rebelato et al., 2018). Yet, some metrics not only measure the levels of overall inbreeding but also give an estimate of when the inbreeding was created, as in the case of F ROH .
The F ROH captures the highest level of inbreeding, especially because the F ROH metric is capable of capturing both recent and more ancient inbreeding (Ghoreishifar et al., 2020). As shown in Table 1, the value for the ancient inbreeding coefficient (F < 8 MB) is higher than more recent inbreeding (F > 8 MB). Such ancient inbreeding could be provided by ancient generations in ancient mating and still be in the population passing it through generations. This division between ancient and recent inbreeding is helpful to manage the diversity in the population. As not all inbreeding is expected to be equally unfavorable, recent inbreeding is expected to have more negative effects than ancient inbreeding (Doekes et al., 2019), therefore maintaining a low level of recent inbreeding coefficient is a desirable goal. Figure 3 illustrates the correlation among the inbreeding coefficient metrics. All the metrics showed a low correlation with F PED . Some authors have already mentioned that the genomic inbreeding metrics are more accurate in assessing individual inbreeding (Curik et al., 2014;Marras et al., 2015;Doekes et al., 2019). This happens due to the particularities mentioned before about the pedigree estimation, but as the F PED , each metric used to calculate the genomic inbreeding coefficient has its specificities and captures a different type of inbreeding that was originally defined by Wright (1922) and/or Malécot (1948. The genomic metrics vary according to the weight that each marker gets to find the G matrix or the allele frequency for each marker (Howard et al., 2017). This affects how the inbreeding is calculated for each individual and the correlation among the metrics. The metrics F HOM and F ROH weigh all the alleles equally, while the metrics F UNI and F GRM give more weight to rare alleles (Alemu et al., 2021). This could explain why the metrics F HOM and F ROH , and F GRM and F UNI show moderate to strong correlation, while the F GRM and F HOM or the F ROH classes had a negative correlation.

Linkage Disequilibrium, Effective Population Size, and Consistency of the Gametic Phase
Higher LD values were observed for markers located closer to each other and a faster decreased LD values were found as the  distance between the markers increased, as observed in other crossbred or composite populations (Prieur et al., 2017;Deng et al., 2019). The extent of LD is strongly influenced by the population history, particularly in domestic animal populations, which have undergone bottlenecks during both domestication and the subsequent formation of breeds (Brito et al., 2015). Such LD is directly related to genomic selection, where the number of markers required to accurately predict breeding values depends on the LD (Larmer et al., 2014). Following the proposed equation by (McKay et al., 2007), the number of markers required for accurate genomic selection will be around 95,000 markers (2.67 GB/30 kb at LD = 0.2) for the Purunã breed. However, it is essential to highlight that for an implementation of genomic selection in Purunã, it is crucial that a sizable training population needs to be generated, to provide accurate genomic predictions of breeding values and selection. Analysis of LD plays a central role in many areas of population genetics, including the determination of genetic maps, ascertainment of levels of recombination at the population level, and Ne estimation (D'Ambrosio et al., 2019). Based on all the metrics, the Ne estimates for Purunã are higher than 100 in the current generations, which is a threshold proposed by Meuwissen (2009) to ensure long-term population sustainability. The Ne estimate based on LD was able to be detected up to the fifth generation ago using the PLINK software (Purcell et al., 2007). We observed a slight divergence between the results from the SNeP and PLINK software, but not as high as reported by Barbato et al. (2015).
Understanding the LD levels, population structure, and CGP across breeds are crucial for implementing genomic selection (Brito et al., 2015). The CGP for all the evaluated breeds, including the Purunã, resulted in a low correlation, as shown in Table 2. These results indicate that the markers' phase (or the phase between markers and QTL) is not consistent across breed pairs. In this context, the possible use of a multi-breed training population for genomic evaluations using these breeds (Purunã, Charolais, Canchim, and Angus) might not result in more accurate genomic breeding values. As the markers are not in the same phase across breeds, the ability to use one breed to determine the effects of SNP to aid in the selection of another population becomes less likely (Larmer et al., 2014). Figure 4A shows the number of ROHs found by chromosome in Purunã. The BTA5 showed a higher number of ROHs, as also observed in other beef cattle studies (Peripolli et al., 2018;Peripolli et al., 2020). The majority of ROH found (62.7%) were classified as short ROHs, and as the length of ROH is negatively correlated with the co-ancestry (Mastrangelo et al., 2018), the ROH found in Purunã were conceived in a more ancient generation. Taking the length of ROH and using the studies that estimate the ROH and correlate with the generation, as the work of Howrigan et al. (2011), the majority of ROH found in this study was created between 10 and 20 generations ago.

Runs of Homozygosity
The ROH can be used for genome characterization and a better understanding of the implications of selection pressure (Marras et al., 2018). An interesting region was identified in the BTA5, which contains significant pathways related to behavioral traits. The first pathway was the sensory perception associated with a series of events required for an organism to receive a sensory stimulus, convert it to a molecular signal, and recognize and characterize the signal (AmiGO, 2021c). The second was the detection of a stimulus pathway related to a stimulus received by a cell or organism. This pathway converts a signal into a response to an extracellular stimulus, associating any movement, secretion, enzyme production, or gene expression in an extracellular stimulus (AmiGO, 2021a). The third pathway was olfactory receptor activity, a pathway related to combining with an odorant and transmitting the signal from one side of the membrane to the other to initiate a change in cell activity in response to the detection of smell (DeMaria and Ngai, 2010).
To track the origin of such a region, a phylogenetic tree ( Figure 5A) was made to evaluate which breed could provide this region. As shown in Figure 5A, the Purunã, Charolais, and Angus animals seem to be genetically closer, and therefore, Angus and Charolais might have contributed to this region. Some studies have previously reported the same region with a high incidence of homozygous sequence (Szmatoła et al., 2019;Fabbri et al., 2021). Some interesting genes, already mentioned in the literature as candidate genes were identified in this region, as MY O 1A (Myosin IA) related to bovine heat-tolerance (Jia et al., 2019), RDH5 (11-cis retinol dehydrogenase 5) associated with feed conversion (de Almeida Santana et al., 2016), and AMDHD1 (amidohydrolase domain containing 1) related to reproduction (Moravčíková et al., 2019).

Heterozygous-Enriched Region
Maintaining diversity at a locus may be advantageous for fitness and could be subject to balancing or countervailing selection (Williams et al., 2016). These heterozygous-enriched regions are single nucleotide differences observed between paternal and maternal chromosomes and can reveal much about the population structure and demographic history . As shown in Figure 4B, the majority of HER found in this study were classified as shorter HER. Interestingly, one region is already mentioned as a conserved region for beef and dairy cattle in the BTA14 (Zhao et al., 2015). This region is variable in at least 17% of the Purunã individuals, demonstrating that even in more conservative regions, the crossbred could provide some variability to the animals.
Although some studies have shown that the majority of HER islands are related to immunity to diseases (Williams et al., 2016), survival rate, and fertility (Biscarini et al., 2020), an interesting pathway was found in our study related to growth. The growth pathway is a biological process related to the increase in size or mass of an entire organism, a part of an organism, or a cell (AmiGO, 2021b). In the case of PPIB, a gene used as a reference gene in studies of gene expression da Costa et al., 2013) or the SDCBP gene, a possible candidate gene related to carcass weight in Hanwoo, a Korean native breed (Lee et al., 2013) and Montana Tropical Composite, a composite beef cattle population developed in Brazil . Another gene mentioned as a possible candidate gene for meat quality and carcass yield was the MT2A, which is involved in glucocorticoid response and with metal and antioxidant biological responds (Haegeman et al., 2003). Although the CISH gene is directly related to insulin metabolism, Fonseca L. F. S. et al. (2020) indicated that this gene could play an essential role in marbling deposition. Our tracking of this pathway was not possible to define a unique breed responsible to provide such a region of HER to Purunã breed or even the breeds where the population is closer, as shown in Figure 5B. This means that such variability is not provided by a unique or small group of breeds, but by the mixture of the breeds used in the creation of the Purunã breed.

CONCLUSION
As observed in the admixture analyses, the Purunã breed received a more significant genetic contribution to its formation from Charolais, Canchim, and Angus. The inbreeding levels for Purunã were small based on multiple inbreeding metrics. Higher LD values were observed for markers with small distances and a faster decrease associated with an increase in the distance between the markers (ranging from 0.43 to 0.04 with distance of 10-1,000 Kb), indicating that a denser panel of markers is necessary to achieve higher levels of accuracy in a genomic selection of Purunã. A high Ne (>100) was observed in all metrics evaluated and the consistency of gametic phase analyses resulted in a small correlation among all breeds, which determines that a multibreed genetic evaluation for Purunã might not be advantageous. An interesting homozygous region was found in the BTA5 with significant pathways related to behavior and genes related to traits such as heat-tolerance (MY O 1A), feed conversion rate (RDH5), and reproduction (AMDHD1). This could indicate a possible pressure of selection in such regions. For the heterozygosity, the number of HER was elevated, but this was expected since Purunã is a composite breed. Among HER regions, an interesting pathway related to growth was identified with higher variability, containing genes previously associated with carcass weight (SDCBP), meat and carcass quality (MT2A), and marbling deposition (CISH).

DATA AVAILABILITY STATEMENT
The Purunã genotypes data are available in the OSF Repository (https://osf.io/7p6wt/). Genotypes from Angus, Canchim, and Charolais are available in the WIDDE database (http://widde. toulouse.inra.fr/widde/). The Nellore datasets presented in this article are not readily available because genotypes from Frontiers in Genetics | www.frontiersin.org databases Katayama are not publicly available but can be obtained through a reasonable request via the corresponding author. Requests to access the datasets should be directed to vbpedrosa@uepg.br.

ETHICS STATEMENT
Ethical review and approval were not required for the animal study because no new animals were handled in this experiment.

AUTHOR CONTRIBUTIONS
HM, LFBR, and VP conceived, designed, and conducted the data analyses. JM, LS, and VP contributed to the data acquisition. HM, LFBR, LD, and VP wrote and edited the manuscript. All authors reviewed and contributed to editing of the manuscript and approved its final publication.