Comparative Genome-Wide Alternative Splicing Analysis of Longissimus Dorsi Muscles Between Japanese Black (Wagyu) and Chinese Red Steppes Cattle

Alternative splicing is a ubiquitous regulatory mechanism in gene expression that allows a single gene generating multiple messenger RNAs (mRNAs). Significant differences in fat deposition ability and meat quality traits have been reported between Japanese black cattle (Wagyu) and Chinese Red Steppes, which presented a unique model for analyzing the effects of transcriptional level on marbling fat in livestock. In previous studies, the differentially expressed genes (DGEs) in longissimus dorsi muscle (LDM) samples between Wagyu and other breeds of beef cattle have been reported. In this study, we further investigated the differences in alternative splicing in LDM between Wagyu and Chinese Red Steppes cattle. We identified several alternative splicing types including cassette exon, mutually exclusive exons, alternative 5′ splice site, alternative 3′ splice site, alternative start exon, and intron retention. In total, 115 differentially expressed alternatively spliced genes were obtained, of which 17 genes were enriched in the metabolic pathway. Among the 17 genes, 5 genes, including MCAT, CPT1B, HADHB, SIRT2, and DGAT1, appeared to be the novel spliced candidates that affect the lipid metabolism in cattle. Additionally, another 17 genes were enriched in the Gene Ontology (GO) terms related to muscle development, such as NR4A1, UQCC2, YBX3/CSDA, ITGA7, etc. Overall, altered splicing and expression levels of these novel candidates between Japanese black cattle and Chinese Red Steppes revealed by RNA-seq suggest their potential involvement in the muscle development and fat deposition of beef cattle.


INTRODUCTION
Alternative splicing or alternative RNA splicing is a method to create different proteins from the same strand of DNA in mammalian cell. The structure and function of protein depend on the sequence of amino acids, which is dictated by the matured messenger RNA (mRNA). Current high-throughput sequencing technology suggests that alternative splicing is more widespread than initially thought, and alternative splicing is likely to be involved in various across tissue types and developmental stages as well as among individuals and populations (1). Hence, alternative splicing is an important and ubiquitous regulatory mechanism to increase the informational diversity and functional capacity of a single gene during posttranscriptional processing, and it also provide a crucial bridge between the genome and the proteome.
The currently accumulated data have highlighted importance of alternative splicing in normal physiology and pathology in human, including cancer, immune and infectious diseases, and metabolic conditions (2). With the development of requirements and technology, many studies used different methods to resolve the alternative splicing on a genome-wide scale (3,4) including microarray (5), expressed sequence tag (EST) (6), and sequencing. Pan et al. (7) analyzed alternative splicing complexity in human tissues showing transcripts from ∼95% of multiexon genes undergo alternative splicing by combining mRNA-Seq with expressed sequence tag (EST)-complementary DNA (cDNA) sequence data. The data of 15 diverse human tissue and cell line transcriptomes indicated that 92-94% of human genes undergo alternative splicing, and most alternative splicing and alternative cleavage events vary between tissues (8). These results also demonstrate the universality and utility of alternative splicing in mammal, and to further understand the alternative splicing will be a crucial effective method in the study of formation of human disease or perfect economic traits in livestock.
In previous studies, a larger number of transcriptomes were performed to analyze the genetics and to improve the production traits in livestock; the differentially expressed genes (DGEs) in longissimus dorsi muscle (LDM) samples among various breeds of beef cattle have been reported (9)(10)(11)(12). However, the alternative splicing data on a genome-wide scale was rarely analyzed and studied in depth. In the present study, we analyzed the differences in expression levels and components of alternative splicing in LMD of Wagyu and Chinese Red Steppes cattle by RNA-seq, and the candidate functional alternative splicing types of genes related to meat traits were provided. The study allowed better genomic characterization of the cattle in terms of transcripts variability and analyzed the potential effect of alternative splicing of genes on muscle development and lipometabolism. It may provide a theoretical basis for the identification of functional transcripts and a potential genetic regulatory elements for the study of genetic mechanism of meat quality traits in beef cattle.

Ethics Statement
Animal care and experiments were performed according to the guidelines established by the care and use of laboratory animals of the Jilin University Animal Care and Use Committee (Approval ID: 20140310).

Animal and Samples
The three Japanese black cattle (28 months old) were randomly chosen from 20 males born from cows that were artificially inseminated with semen stocks from the same bull. The three Chinese Red Steppes cattle (28 months old) were similarly chosen from 20 males with a common father. Three Japanese black cattle and three Chinese Red Steppes cattle were provided by the National Research Center for Animal Transgenic Biotechnology, Inner Mongolia University (Hohhot, China) and the Branch of Animal Science, Jilin Academy of Agricultural Sciences (Gongzhuling, China), respectively. The two farms housing the two groups of cattle were located at similar altitudes with similar natural weather conditions. The cattle in both groups were raised under similar normal conditions on a diet of corn and hay with access to feed and water ad libitum.
The marble scores of the 12-13 rib eye meat of these two breeds were accessed according to the classification of beef marbling grade standard. The marble scores of the three Wagyu cattle are 7, 7, and 6, which belong to the Grade 4 meat quality level (good grade). In the other group, the marble scores of the three Chinese Red Steppes cattle were 2, 2, and 2, respectively, which belonged to the Grade 2 meat quality level (below average grade). The tissue samples were transported in dry ice and stored in liquid nitrogen in the laboratory.

RNA Extraction and Quality Analysis
Total RNA was isolated from LDM using Trizol reagent (Invitrogen, USA), treated with DNase I (NEB, Beijing, China), extracted with phenol-chloroform, and precipitated with ethanol. The quality and quantity of total RNA were determined using an Agilent 2100 Bioanalyzer (Agilent technologies, Palo Alto, CA). The mRNA was purified from total RNA using poly-T oligoattached magnetic beads. From each sample, 1.5 µg of mRNA was used to construct six cDNA libraries for sequencing. The mRNA was treated with a Thermomixer (Eppendorf AG, Hamburg, Germany) to generate fragments with an average size of 200 bp (200 ± 25 bp) for the paired-end libraries. The fragmented mRNA was then used as templates for synthesizing the firststrand cDNA. The double-stranded cDNAs were purified and ligated to adaptors for Illumina paired-end sequencing. Library concentration was quantified by quantitative PCR (qPCR) and a Qubit R 2.0 Flurometer (Life Technologies, CA, USA), and the insert size was checked on an Agilent 2100 Bioanalyzer. The cDNA libraries were sequenced using the Illumina HiSeq2000 platform by the Beijing Genomics Institute, and 100 nt pairedend reads were generated. The raw data have been submitted to the Gene Expression Omnibus (GEO).

Analysis of Alternative Splicing and Prediction of Novel Transcripts of Genes
Clean high-quality reads were obtained from the raw reads by removing the adaptor sequences and low-quality reads. An RNA-Seq TopHat (v2.0.6) splicing junction mapper was used for RNA-Seq alignment (13). We used our own localized reads to calculate the number of reads for each gene and the read per kilobase and read per million (RPKM) values. Other statistical results were also analyzed, such as gene range and depth and read distribution around start and stop codons.
The reads were mapped to the reference genome (Bos taurus UMD_3.1.1) using the ultra-fast, short-read mapping program Bowtie (http://www.bowtie-bio.sourceforge.net) (13). The mapped reads were assembled using MAQ (14) to identify possible splicing junctions, the unmapped reads are split into smaller segments to allow alignment to the reference genome, and splice junctions were defined by the seed expansion procedure. All junction reads must be at least 6 nt that exactly match each of the two adjacent areas of the potential junction sites, and all junction sites with N3 non-redundant reads in both group were filtered. Alternative splicing detector software based on junction reads (http://www.novelbio.com/asd/ASD.html) (15,16), which recognizes and detects differential alternative splicing exons between two groups of RNA-Seq data, was used. We then counted the number of junction reads that matched either the inclusion or exclusion isoforms of both samples and calculated the p-value based on the junction read count between the samples using Fisher's exact test. Alternate exon read coverage for the corresponding gene was also calculated on the sample, and a second p-value based on the alternative exon read coverage for gene read coverage between samples was calculated using Fisher's exact test. The adjusted p-values were obtained by combining the two p-values using a weighted arithmetic expression to assess the statistical significance of alternative splicing between samples. Differentially expressed alternatively spliced genes between Japanese black cattle and Chinese Red Steppes were considered as differentially expressed according to adjusted p < 0.05.

Gene Ontology and Kyoto Encyclopedia of Genes and Genomics Enrichment Analysis
Differentially expressed alternatively spliced genes were implemented by the Gene Ontology (GO) seq R package (17). GO was used to determine and compare the functions of the differentially expressed alternatively spliced genes as biological process, molecular function, and cellular component, with corrected p < 0.05 considered significantly enriched. Association of the genes with pathways was computed with the Kyoto Encyclopedia of Genes and Genomics (KEGG) (18)(19)(20). The pathway with a corrected p < 0.05 was considered as significantly enriched.

Identification of Types and Quantification of Transcripts
One microgram total RNA was used to synthesize cDNA with the Transcriptor first-strand cDNA synthesis kit (Roche, USA) according to the manufacturer's recommendations. The primers were designed using Primer Premier 6.0 software (Premier BioSoft, Palo Alto, CA, USA) (Supplementary Tables 1, 2). The types of transcripts were identified by PCR in a 25.0 µl reaction volume including 12.5 µl of 2× Taq PCR Master Mix (Vazyme, Nanjing, China), 0.5 µl of upstream and downstream primers (10 µM), 1.0 µl of cDNA, and 10.5 µl of ultrapure water. The PCR conditions were as follows: 95 • C for 5 min; followed by 30 cycles of 95 • C for 30 s, 57 • C for 30 s, and 72 • C for 1 min; and finally, 10 min extension at 72 • C. The PCR products (8.0 µl) were run on 2% agarose gels.
Quantification of expression levels of differentially expressed alternatively spliced genes was detected by real-time PCR. Realtime PCR was performed using PCRmax Eco 48 (PCRmax, Staffordshire, UK). The real-time PCR system was as follows: 5 µl FastStart Universal SYBR Green Master (ROX) (Roche), 1 µl cDNA, 0.2 µl primer-F (10 µM), 0.2 µl primer-R (10 µM), and 3.6 µl RNase-free water. Reactions were incubated at 95 • C for 10 min, followed by 40 cycles of 95 • C for 10 s and 60 • C for 30 s. β-Actin were used as reference genes, and relative expression levels were calculated using the 2 − Ct . The data represent mean ± SEM from at least three independent experiments. Statistical analysis was performed by Student's t test at a significant level of p < 0.05.

RNA-seq and Transcriptome Analysis of Chinese Red Steppes and Japanese Black Cattle in LDM
A total of 268,490,636 reads were obtained from six samples in LDM. After filtering out low-quality data, 244,238,430 clean reads were mapped to Bos taurus (UMD_3.1.1). High-quality maps of the two breeds were obtained, and the unique mapping rates ranged from 87.3 to 88.3%. The details of the sequencing data quality of mRNA are shown in Table 1. The distribution of the reads on the chromosomes is shown in Figure 1A, and gene structure analysis was performed for each sample (Figure 1B). The sequence data were submitted to the GEO with the accession number of GSE161967.
A total of 155 differentially expressed alternatively spliced events between two breeds were screened according to adjusted p < 0.05. The differentially expressed alternatively spliced events refer to 59 Cassette, 7 Cassette multi, 11 A3SS, 19 A5SS, 12 AltEnd, 37 AltStart, 5 IR, and 5 MXE (Figure 2B). One hundred forty-four genes corresponding to differentially expressed alternative splicing were found between Chinese Red Steppes and Japanese black cattle including TPM2, STRIP2, CPT1B, SYNPO2L, SIRT2, etc. The top 15 differentially expressed alternatively spliced genes are listed in Table 2.

GO and KEGG Enrichment of Differentially Expressed Alternatively Spliced Genes
Functional enrichments on differentially expressed alternatively spliced genes by GO analysis found that 814, 532, and 232 GO terms were enriched in biological processes, molecular functions, and cellular components, respectively. A total of 216 biological processes, 115 molecular functions, and 54 cellular components were significantly enriched (p < 0.   The top 15 pathways are listed in Figure 3B by the ascending order of p-value.

Identification of Alternative Splicing Types and Verification of Expression Levels of Transcripts
To further test the accuracy of our sequencing and prediction data, integrin subunit alpha 7 (ITGA7) and Y-box binding protein 3 (YBX3/CSDA) were validated by PCR and real-time PCR.
CSDA was predicted and identified two transcripts and the long transcript with a cassette of exon (Figures 4A,B). The expression levels of the transcript with a cassette of exon is higher in Japanese black cattle, and the long transcript expression levels were significantly lower than short in two breeds (p < 0.05) ( Figure 4C). ITGA7 was also identified as having two transcripts, and the type of alternative splicing is cassette (Figures 4D,E); moreover, the expression levels of genes were higher in Japanese black cattle, and the long transcript with a cassette of exon was lower than in Chinese Red Steppes (Figure 4F). The results indicated that the alternative splicing prediction based on RNAseq data was reliable.

DISCUSSION
In the present study, 12,241 alternatively spliced events corresponding to 5,295 genes indicate the ubiquity of alternatively spliced genes in the bovine genome, and cassette is the most abundant alternatively spliced types like other species of livestock; this distribution pattern is also similar to that of other animals reported previously (7,21,22). Comprehensively, these results suggest that animals might possess similar alternative splicing forms. Numerous alternative splicing events occur during cell differentiation and tissue maturation, suggesting that alternative splicing supports proper development. Although the mechanisms and outcomes of alternative splicing of individual transcripts are relatively well-understood in dairy and beef cattle, such as angiopoietin like 6 (ANGPTL6) (23), transmembrane protein 95 (TMEM95) (24), and calpain 3 (CAPN3) (25), the scopes and exact functions of this regulatory mechanism still remain to be investigated on novel transcripts and alternative splicing networks.
According to the KEGG enrichment, a total of 59 alternatively spliced genes were enriched in 138 pathways, of which 17 genes including phosphatidylinositol 4-kinase beta (PI4KB), diacylglycerol O-acyltransferase 1 (DGAT1), uridine-cytidine kinase 1 like 1 (UCKL1), NADH: ubiquinone oxidoreductase subunit C2 (NDUFC2), and D-aspartate oxidase (DDO) were enriched in metabolism pathways. The functional characterization of these genes is well-known in previous accumulated data. Therefore, we speculate that these genes also could mediate the formation of meat quality traits through the regulation of the related pathways by expression of different transcripts.
Although the differentially expressed alternative splicing in the same tissue of the same development stage should have totally similar alternative splicing, 155 differentially expressed alternatively spliced events between Japanese black cattle and Chinese Red Steppes were screened in the present study, which indicated that the corresponding genes or splicing events may be caused by the breed effect of fat deposition and muscle development capacity. Thus, understanding the function of the multiple transcripts and the effect of expression levels of each transcript such as DGAT1, CPT1B, MCAT, and hydroxyacyl-CoA dehydrogenase trifunctional multi enzyme complex subunit beta (HADHB) will be a key to study the formation of perfect quality meat traits in cattle.
The DNA sequence of CPT1B contains 8,755 bases, and it contains 19 exons and 18 introns reported in GenBank. CPT1B is reported as a key rate-limiting enzyme in the β-oxidation of fatty acids as well as playing an important role in regulating the decomposition and energy supply of fat (26,27). Our previous study also found the single-nucleotide polymorphisms (SNPs) in the gene associated with meat quality. Moreover, the data of this study showed CPT1B has an alternative 5 ′ splice site at exon 7, and the novel transcript was a high expression in Japanese black cattle. Therefore, the result indicated that the expression of different transcripts of CPT1B between the two breeds may be related to the capability in fat deposition, while there is few report on alternative splicing of this gene; thus, the specific functions and mechanisms of the two transcripts need further verification.
DGAT1 mediates triacylglycerol biosynthesis at the last committed step. It is widely known as the functional gene of milk fat and fatty acids contents (28)(29)(30), and also a gene could affect the fatty acids composition and fat content in the muscle and fat of beef cattle (31,32). Alternative splicing of the DGAT1 gene plays an important role in peanut triglyceride metabolism (33). Moreover, two alternative splicing types with different functions in triglyceride metabolism of DGAT1 gene have been found in the diatom Phaeodactylum tricornutum (34). In the present study, two transcripts of DGAT1 gene were also found in LDM; however, the structure and roles of alternative splicing are rarely reported in other studies in livestock; therefore, the functions of two alternative splicing types need further verification.
Although the present study found that many genes have novel transcripts with different expression levels, the specific genetic functions and regulation mechanisms of translated proteins by these mRNAs of the same gene on bovine muscle development, fat deposition, and meat quality traits still require further study.

CONCLUSIONS
In this study, we disclosed features of genome-wide alternative splicing in two breeds with significant differences in fat deposition ability and meat quality traits through comprehensive transcriptome analysis by RNA-seq. The results suggest that 5,295 genes correspond to alternative splicing in bovine genome. Moreover, we found that cassette is the most abundant alternatively spliced types like other species. The analysis of functional categories demonstrates that differentially expressed alternatively spliced genes, including MCAT, CPT1B, HADHB, SIRT2, and DGAT1, have been reported to have regulatory effects on meat quality traits in previous studies. The results indicated that different types of alternative splicing and regulatory networks constructed by them in this study might partially contribute to the significant variance in meat quality traits between Japanese black and Chinese Red Steppes cattle. Therefore, the study provided a new direction for revealing the regulatory mechanism of gene regulatory elements on fat deposition and muscle development, which is the effects of different alternative splicing of candidate functional genes on marbling trait of beef cattle.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found at: https://www.ncbi.nlm.nih. gov/genbank/, GSE161967.

ETHICS STATEMENT
The animal study was reviewed and approved by laboratory animals of the Jilin University Animal Care and Use Committee (Approval ID: 20140310).

AUTHOR CONTRIBUTIONS
ZZ and RY designed the study. LQ and PJ collected the samples. XF and HY collected the data. XF, LX, ZB, and WH conducted the bioinformatics analyses and wrote the manuscript. WH and YZ revised the manuscript. All authors read and approved the final manuscript.

ACKNOWLEDGMENTS
We are grateful to the Branch of Animal Science, Jilin Academy of Agricultural Sciences and National Research Center for Animal Transgenic Bio-technology, Inner Mongolia University for animal sampling. The authors thank Novel Bioinformatics Co., Ltd for the technical help in data analysis.