Network Analysis Reveals Putative Genes Affecting Meat Quality in Angus Cattle

Improvements in eating satisfaction will benefit consumers and should increase beef demand which is of interest to the beef industry. Tenderness, juiciness, and flavor are major determinants of the palatability of beef and are often used to reflect eating satisfaction. Carcass qualities are used as indicator traits for meat quality, with higher quality grade carcasses expected to relate to more tender and palatable meat. However, meat quality is a complex concept determined by many component traits making interpretation of genome-wide association studies (GWAS) on any one component challenging to interpret. Recent approaches combining traditional GWAS with gene network interactions theory could be more efficient in dissecting the genetic architecture of complex traits. Phenotypic measures of 23 traits reflecting carcass characteristics, components of meat quality, along with mineral and peptide concentrations were used along with Illumina 54k bovine SNP genotypes to derive an annotated gene network associated with meat quality in 2,110 Angus beef cattle. The efficient mixed model association (EMMAX) approach in combination with a genomic relationship matrix was used to directly estimate the associations between 54k SNP genotypes and each of the 23 component traits. Genomic correlated regions were identified by partial correlations which were further used along with an information theory algorithm to derive gene network clusters. Correlated SNP across 23 component traits were subjected to network scoring and visualization software to identify significant SNP. Significant pathways implicated in the meat quality complex through GO term enrichment analysis included angiogenesis, inflammation, transmembrane transporter activity, and receptor activity. These results suggest that network analysis using partial correlations and annotation of significant SNP can reveal the genetic architecture of complex traits and provide novel information regarding biological mechanisms and genes that lead to complex phenotypes, like meat quality, and the nutritional and healthfulness value of beef. Improvements in genome annotation and knowledge of gene function will contribute to more comprehensive analyses that will advance our ability to dissect the complex architecture of complex traits.


INTRODUCTION
Increasing consumer demand for beef is an important strategic objective for the beef industry and recent studies suggest consumers have a strong focus on beef quality (Igo et al., 2013;Schroeder et al., 2013). Beef quality is largely communicated in terms of USDA quality grade as higher quality grade beef will contain more intramuscular fat, which improves flavor, juiciness, and positively influences tenderness (Koohmaraie et al., 2002). These properties have a large impact on the eating experience of the consumer, and consumer's eating satisfaction is the main driver of beef demand (Schroeder et al., 2013). However, in addition to eating satisfaction, other attributes including nutritional value, and healthfulness (fatty acid composition and mineral content) are important components of quality in the eyes of modern consumers.
All the components defining eating quality can be regarded as quantitative traits, controlled by many genes and impacted by environmental factors. Most component traits are difficult and expensive to measure and not available to measure until late in life or after the animal has been harvested. Such traits are difficult to improve through traditional phenotypic selection, but are ideal candidates for genomic selection if genetic markers explaining a large enough proportion of the variation can be identified. Warner-Bratzler Shear Force (WBSF, an objective measure of tenderness) and the intramuscular fat content (IMFC) were identified from an extensive set of carcass and meat composition traits to be the best predictors of eating quality (Mateescu et al., 2016). Those indicator traits are difficult to measure on live animals and DNA tests that can accurately identify cattle with superior genetics for WBSF and IMFC would be helpful. Knowledge of the genetics controlling these traits along with a precise understanding of the biological networks and interactions underlying the meat quality complex will increase the ability of the industry to improve cattle to better meet consumer expectations.
Numerous genome-wide association studies (GWAS) have been performed in different Bos Taurus (Gutiérrez-Gil et al., 2008;Esmailizadeh et al., 2011;McClure et al., 2012;Allais et al., 2014;Xia et al., 2016), Bos Indicus (Tizioto et al., 2013;Magalhães et al., 2016) or crossbred beef cattle breeds (Bolormaa et al., 2011;Lu et al., 2013;Hulsman Hanna et al., 2014), and with different phenotypes describing meat quality, from carcass characteristics to specific measures of eating satisfaction. These studies contribute to our present understanding of the genetic regulation for many of these traits but they also highlight some of the challenges and limitations associated with GWA studies. Many chromosomal regions identified are unique to the specific population in which they were discovered and were not replicated in other studies. More importantly, very few functional mutations have been identified and most of the genetic variation controlling these traits remains unknown. Recently, new methodology has been developed in an effort to address this limitation and allow for a better understanding of the genetic architecture of complex traits through a gene network analysis (Fortes et al., 2010;Reverter and Fortes, 2013).
The first objective of this study was to carry out GWAS to identify chromosomal regions associated with each of the different components of meat quality. The second objective was to use the Association Weight Matrices (AWM) and Partial Correlation and Information Theory (PCIT) to explore the functional mechanisms underlying GWAS associations for meat quality traits in Angus cattle to explore the biological mechanism by which GWAS-identified genomic variants give rise to phenotypic differences in eating quality.

Animals and Sample Collection
The Iowa State University and Oklahoma State University Institutional Review Boards approved the experimental protocols used in this study. A total of 2,110 Angus-sired animals comprising bulls (n = 500), steers (n = 1,210), and heifers (n = 400) representing 155 sires were used in this study. All cattle were finished on concentrate diets in Iowa (n = 994), California (n = 345), Colorado (n = 352), or Texas (n = 419). Animals with an average age of 457 ± 46 days were harvested at commercial facilities. Details on production characteristics, meat sample collection, and preparation have been previously reported (Garmyn et al., 2011). Two 1.27-cm steaks from the longissimus muscle were trimmed of external fat and connective tissue and were analyzed for fatty acid and nutrient composition at Iowa State University (Ames, IA), using methods previously described (Garmyn et al., 2011;Mateescu et al., 2012) and for WBSF and sensory analyses at Oklahoma State University Food and Agricultural Products Center (Stillwater, OK) (Mateescu et al., 2015). Four carcass phenotypes: hot carcass weight (HCW), percentage kidney pelvic and heart fat (KPH), ribeye area (REA), and fat thickness (FAT); five meat quality phenotypes: marbling score (MS), IMFC, WBSF, sensory panel tenderness (TEND), sensory panel juiciness (JUIC); seven mineral concentrations: calcium, iron, magnesium, phosphorus, potassium, sodium, and zinc; four peptides: anserine, carnosine, creatine, and creatinine; and three groups of fatty acids: saturated (SFA), monounsaturated (MUFA), and polyunsaturated (PUFA) were used in this study. A description of the 23 traits along with a summary of descriptive statistics for this population is in Supplementary  Table 1.

Genome-Wide Association Study (GWAS) of Meat Quality Phenotypes
Genomic DNA extracted from the meat sample was genotyped with the Bovine SNP50 Infinium II BeadChip (Illumina, San Diego, CA). Those SNP with significant deviations from Hardy-Weinberg equilibrium at a significance level P < 0.0001 were removed prior to association analysis. Additionally, we used quality control filters for minor allele frequency (5%) and call rate for sample and SNP (95%). After quality control, 40,875 SNP were left and included in subsequent analyses. All GWAS were performed using the single-locus mixed linear model procedure implemented in Golden Helix SVS v8.4.4 software (Golden Helix Inc., Bozeman, MT, USA). The efficient mixed model association (EMMAX) approach in combination with a genomic relationship matrix was used to directly estimate the genetic and residual variance components σ 2 g and σ 2 e and the proportion of variance explained by the effects of significant SNP (Kang et al., 2010;Segura et al., 2012). In matrix notation, the basic model equation was: Where Y is a vector of phenotypes for each of the meat quality traits measured on all the animals, β is the effect size of fixed effects (contemporary groups), g ∼ N(0, σ 2 a K) is a random effect and e ∼ N(0, σ 2 e I), where K is the genomic relationship matrix among animals.
Contemporary groups were defined based on gender at harvest (bull, steer, or heifer), finishing location (California, Colorado, Iowa, Texas), and harvest date, which resulted in a total of 33 groups. Contemporary groups were fit as fixed class effects in all genomic analyses. Pseudo-heritability was estimated as h 2 = σ 2 a /(σ 2 e + σ 2 e ) based on the estimates of the variance parameters (Kang et al., 2010). The p-values and additive genetic values for each SNP were obtained for each phenotype and these were used to construct the association weight matrix (AWM; Reverter and Fortes, 2013).

Association Weight Matrix
The AWM approach (Reverter and Fortes, 2013) was used to interpret the results from GWAS. The WBSF was selected as the key phenotype to describe the complex of traits related to tenderness and meat quality. An initial set of 1,842 SNP with largest estimated additive effects for WBSF were selected based on their raw P < 0.05. A less stringent level at this stage is recommended to allow for a proper integration of potentially important regulators across multiple traits. One advantage of the AWM/PCIT methodology is the ability to include SNP with relatively small effects which do not reach genome-wide statistical significance but are potentially linked to elements controlling the trait of interest. It is well-recognized that many elements with minor effects are usually not able to reach significance at the genome level, but they will be uncovered through a gene network when multiple correlated traits are used in the analysis  (Fortes et al., 2010). The average number of other phenotypes associated with these SNP at a P < 0.05 was calculated and 1,318 SNP associated with at least two phenotypes were included in the AWM. To build the AWM, a vector of posterior mean estimates of 1,318 SNP effects from WBSF was enhanced with the vectors of effects of all the other 22 phenotypes. This 1,318 × 22 matrix of posterior mean estimates of SNP effects was used as the input for PCIT to detect similar effects for any SNP across multiple phenotypes. All SNP pairs within the matrix were tested for association with at least one other SNP in order to establish network connections. SNP pairs without a significant partial correlation to at least one other SNP were removed from the dataset to discard them from subsequent network association analysis since they would appear isolated.  Networks of SNP showing common effects across multiple quality traits were constructed based on the computed correlations among SNP. Correlation between SNP pairs with a non-zero partial correlation to another SNP were input into Cytoscape 3.5.1 (Shannon et al., 2003) software to create gene network clusters using the MCODE plugin (Bader and Hogue, 2003;Saito et al., 2012). Networks were scored and ranked by the MCODE algorithm as network density times Frontiers in Genetics | www.frontiersin.org

Gene Ontology Enrichment Analysis and Visualization
DAVID v6.7 Functional Annotation Tool (Huang et al., 2009) was used for gene ontology (GO) enrichment in order to detect enriched biological terms associated with genomic regions and gene networks identified in the analysis. The GO term enrichment and clustering was performed on all annotated genes associated with the quality traits. Functional grouping based on kappa score and visualization in a functionally grouped network was performed using the ClueGO (Bindea et al., 2009) plug-in in Cytoscape. A P < 0.05 and kappa coefficient > 0.3 were considered as threshold values.

Meat Quality Genome-Wide Association Study
Summary statistics for carcass quality, meat quality, mineral content, fatty acid composition, and peptide content phenotypes are presented in Table 1 along with heritability estimates for each trait and general GWAS information. Complete GWAS results for all 23 individual meat quality traits are presented in Supplementary Table 1. The GWAS for our main meat quality trait, WBSF, resulted in 1,878 SNP associated with this trait at P < 0.05, of which there were 383 SNP at P < 0.01 and 56 SNP at P < 0.001. A list with detailed information on the top 35 markers (P < 0.00005) associated with WBSF is in Table 2, and additional information in Supplementary Table 2. The number of significant SNP was similar across all 23 traits and ranged from 1,729 to 1,971 at P < 0.05, from 331 to 428 at P < 0.01, and 25 to 112 at P < 0.001. There were 68 SNP significantly associated with 10 or more traits at P < 0.05 and 7 SNP significantly associated with 15 or more traits ( Table 3, additional information  in Supplementary Table 3). The most significant regions for WBSF were identified, in order of significance, on BTA29, 20, 10, 7, 3, and 4. Most of these chromosomal regions harbor potential candidate genes for tenderness that have been identified in other studies in several cattle breeds. Among these, CAST (on BTA7) and CAPN1 (on BTA29) have been consistently identified and have a role in muscle proteolysis during meat aging (Smith et al., 2000). In fact, 13 out of the 56 SNP significant for WBSF at P < 0.001 were located in a 3 cM region around CAPN1 (three SNP directly in CAPN1) and four of the 56 SNP were located around CAST.

Meat Quality Gene Networks
Among the 1,842 SNP significant (P < 0.05) for WBSF, there were 839 SNP associated with at least two other phenotypes. Some 772 SNP were found to be located within a gene (n = 712) or within 2.5 kbp from a gene (n = 60) and therefore were used to form the AWM. A total of 688 annotated genes were found associated with at least one other gene and had significant direct and partial correlations. This correlation matrix generated a gene network consisting of 688 genes (nodes) and 99,568 gene relationships (edges). The Cytoscape MCODE plugin colocalized these SNP into 17 separate networks and detailed information on the top five networks is in Table 4. Nodes with no gene or feature annotation were removed for visual simplicity from the figures. The significance of each node is indicated by its location within the network, and the distance from the center indicates the total number of connections and importance to the phenotype. A direct correlation detected through the PCIT analysis is represented as a connection or edge in the network. The highest scoring network contained 324 nodes and 53,424 edges, or connections. The clusters of genes represent scored networks derived through the PCIT analysis, and theoretically these clusters of genes function as molecular complexes controlling the specified phenotype.
There are numerous candidate genes within these networks involved in metabolic and cellular processes that have possible impacts on meat quality traits. The genes CAPN1 and CAST, are well-known candidate genes for tenderness and meat quality traits (Goll et al., 1992;Geesink and Koohmaraie, 1999;Page et al., 2002;Casas et al., 2014), and are identified as major nodes in two subnetworks. There are many other candidate genes with a high network score indicating a high number of direct and indirect correlations with supporting evidence in the literature for their relationship to muscle growth and metabolism, calcium metabolism, adipogenesis, extracellular matrix protein interactions, and regulation. The gene MYOM1 (myomesin 1) is expressed in muscle cells and contributes to the three-dimensional conformation stability of the thick filament (Moreno-Sánchez et al., 2010;Picard et al., 2015). The CALCOCO1 gene (Calcium Binding and Coiled-Coil Domain 1) was shown to provide a link between cellular metabolism (phosphate and glucose metabolism), protein synthesis and degradation, calcium signaling and cell growth (Yang et al., 2006). The gene ALDOA (Aldolase A), that may encode a scaffolding protein, plays a key role in glycolysis and gluconeogenesis (Hocquette and Gigli, 2005;D'Alessandro and Zolla, 2013;Gobert et al., 2014). Among the 3 isozymes (A, B, and C), Aldolase A is present in the developing embryo and it is found in greater quantities in the skeletal adult muscle where it accumulates around the M line and within the I band, localizing with FBP2 on both sides of the Z line in the absence of calcium. ADAMTS15 encodes a member of the ADAMTS (a disintegrin and metalloproteinase with thrombospondin motifs) protein family (Stanton et al., 2011). The encoded preproprotein is proteolytically processed to generate the mature enzyme, which may play a role in versican processing during skeletal muscle development (De Jager et al., 2013;Mudadu et al., 2016). KLHL2 is a component of an ubiquitin-protein ligase complex that mediates the ubiquitination of target proteins, which most often leads to their proteasomal degradation and plays a role in the actin cytoskeleton reorganization. The CRTAC1 gene encodes a glycosylated extracellular matrix protein located in the interterritorial matrix of articular deep zone cartilage and the protein may be involved in cell-cell or cell-matrix interactions (Anjos et al., 2017). Overall, these examples provide strong evidence that the network methodology used in this study allows the co-localization of biologically relevant genes with a close relationship to different aspects of meat quality variation. TNS4 (tensin 4) encodes an actin binding protein involved in cell migration, cartilage development and in connecting signal transduction pathways to the cytoskeleton ( Van de Werken et al., 1993;Lo, 2004). The gene encoded by COL27A1 is a member of the fibrillar collagen family, and plays a role during the FIGURE 4 | Functionally grouped network for meat quality complex in Angus cattle. Nodes represent functional terms linked based on their kappa score level (>0.3) with only the most significant term per group shown as a label. The node size represents the enrichment significance of the term. Only genes in common between two or more GO terms are used.
Frontiers in Genetics | www.frontiersin.org calcification of cartilage and the transition of cartilage to bone (Pace et al., 2003).

Gene Ontology Term Enrichment Analysis
Gene ontology and pathway enrichment analyses were carried out to gain insight into the predicted gene networks using PANTHER Overrepresentation Test and DAVID Functional Classification Clustering tools. The PANTHER classifications are presented according to molecular function, biological process, and cellular component in Figures 1-3.
Significant results for the DAVID Functional Annotation Clustering results for the gene networks are in Table 5.
An enrichment score of 1.3 was used as a significance threshold for DAVID Functional Annotation Clusters while a P < 0.05 was used to designate the functional annotation chart GO terms as significantly enriched (Huang et al., 2009). The false discovery rate (FDR) included in the Functional Annotation Chart can be used to determine the importance of terms considered significant through the P-value statistic.
A functionally grouped annotation network (Figure 4) was developed based on 576 unique and annotated genes from the AWM/PCIT analysis and the network was visualized using the ClueGO plug-in for Cytoscape. Only 544 genes were recognized by ClueGO, 454 (83.46%) were functionally annotated in the "Molecular Function" ontology, and 442 (81.25%) were associated with representative terms and pathways after applying the selection criteria. Twenty-two GO terms were significantly represented in this network. The most representative term was "Binding" with 358 genes and 3.58% associated genes, followed by "Ion Binding, " "Protein binding, " Catalytic activity, " and "Organic Cyclic compound binding." Higher connectivity between GO terms with similar molecular function are to be expected, but a high priority in terms of future research will be placed on genes common between several different GO terms as these might point toward key regulator genes with higher impact on the meat quality complex. The type of analyses used in this study and aimed at dissecting and understanding the gene networks and their contribution to the phenotypic expression of complex traits is highly dependent on the level of annotation of the respective genome but will further our general knowledge of gene function.
The gene network technique employed in this work advances the genomic analysis of complex traits beyond the simple marker association analysis by allowing the inclusion of markers which initially are not able to reach a very stringent genome-wide significance status. These markers and the genomic regions they represent could be legitimate markers and regions explaining a small portion of the variation in these complex traits, but they do not have large enough effects in order to reach significance. The danger of a false positive is overcome through the gene/network enrichment analysis where a true false positive gene would most likely be eliminated while genes with a real but small effect on the trait will be validate through their biological role in a specific pathway contributing to trait of interest. However, it is important that these results are validated through additional functional analyses at the gene expression or proteomics level.

CONCLUSION
Traits including four carcass measures, five meat quality phenotypes, seven mineral concentrations, and four peptide concentrations were used in GWAS to populate a gene network analysis using the methodology of AWM/PCIT. An analysis of genomic regions that affect different aspects of meat quality highlighted genes overrepresented in molecular functions related to calcium and other ion binding and regulation, catalytic and transporter activity, and nucleic acid and RNA binding. Several genes were found in a majority of enriched pathways suggesting possible key regulatory roles for these genes. This also provides evidence for the interconnections between the individual pathways and sheds some light on how these different pathways control the meat quality phenotype. The combination of GWAS results with PCIT and network visualization represents a powerful methodology for identifying novel candidate genes of interest for complex traits influenced by multiple component phenotypes. This methodology allows for a dissection of the biological mechanisms and gene networks that lead to these complex phenotypes.