Novel lncRNA lncFAM200B: Molecular Characteristics and Effects of Genetic Variants on Promoter Activity and Cattle Body Measurement Traits

Skeletal muscle is one of the three major muscle types in an organism and has key roles in the motor system, metabolism, and homeostasis. RNA-Seq analysis showed that a novel lncRNA, lncFAM200B, was differentially expressed in embryonic, neonatal, and adult cattle skeletal muscles. The main aim of this study was to investigate the molecular and expression characteristics of lncFAM200B along with its crucial genetic variations. Our results showed that bovine lncFAM200B is a 472 nucleotide (nt) non-coding RNA containing two exons. Transcription factor binding site prediction found that the lncFAM200B promoter region was enriched with binding sites for the SP1 transcription factor, which promotes the binding of the myogenic regulatory factor MyoD to DNA. Expression analysis showed that lncFAM200B was differentially expressed in embryonic, neonatal, and adult bovine muscle tissues, and that its expression trend was positively correlated with that of MyoG and Myf5 during the myoblast proliferation and differentiation stages. To identify the active region of the lncFAM200B promoter, we constructed pGL3-Basic luciferase reporter plasmids containing lncFAM200B promoter sequences and transfected them into 293T, C2C12, and 3T3-L1 cells. Our results suggested that the promoter active region spans −403 to −139 (264 nt) relative to the transcription start site and covers 6 potential SP1 binding sites. Furthermore, we found a novel C-T variation, named SNP2 (ERZ990081 in the European Variation Archive), in the promoter active region, which was linked to the nearby SNP1 (rs456951291 in the Ensembl database). The genotypes of SNP1 and the combined genotypes of SNP1 and SNP2 were significantly associated with hip height in Jinnan cattle. Luciferase activity analysis found that the SNP1-SNP2 haplotype CC had the highest luciferase activity, consistent with the association result that carriers of the combined genotype CC-CC had the highest hip height in Jinnan cattle.
In conclusion, our data showed that lncFAM200B is a positive regulator of muscle development and that SNP1 and SNP2 could be used as genetic markers for marker-assisted selection (MAS) breeding of beef cattle.

INTRODUCTION
Long non-coding RNAs (lncRNAs) are an important class of non-coding RNAs (ncRNAs) involved in a variety of biological processes. LncRNAs are usually longer than 200 nucleotides (nt); most are transcribed by RNA polymerase II, and some by RNA polymerase III. Similar to mRNAs, lncRNAs show clear temporal (the same tissue at different developmental stages) and spatial (different tissues) expression specificity. An lncRNA gene has its own promoter, which can be recognized by specific transcription factors. In the last decade, lncRNAs have been shown to have multiple functions in many developmental processes, such as regulating gene expression at the transcriptional, post-transcriptional, or epigenetic level (Fernandes et al., 2019). In addition, lncRNAs can act as sponges for miRNAs, relieving miRNA-mediated repression of target genes. Although lncRNAs have important biological functions, their sequence conservation among species is low. Thus, it is important to understand the roles of novel lncRNAs in various biological processes in different species.
Skeletal muscles account for about 40% of human body weight; they are not only the dynamic part of the motor system but also play a key role in metabolism and homeostasis. Skeletal muscles are composed primarily of multinucleated myotubes, which are derived from myogenic progenitor cells (MPCs). MPCs are destined to become myoblasts, which subsequently form myotubes after proliferation, differentiation, and fusion. This process is regulated by a variety of transcription factors and epigenetic regulators, such as the myogenic regulatory factors myogenic differentiation 1 (MyoD), myogenin (MyoG), myogenic factor 5 (Myf5), and myosin heavy chain 3 (MYH3) (Bharathy et al., 2013). Recently, with the rapid development of sequencing technology, an increasing number of studies have found that lncRNAs play a crucial role in muscle development (Yu et al., 2017; Zhu et al., 2017; Li et al., 2018). In cattle, lncRNA sequencing has likewise shown that lncRNAs are crucial in muscle development (Billerey et al., 2014; Sun et al., 2016; Liu et al., 2017). Although the functions of some lncRNAs, such as lncMD, lncYYW, and lnc133b, in bovine muscle development have been identified, the roles of numerous lncRNAs remain to be explored (Jin et al., 2017; Yue et al., 2017).
Muscle development is one of the main factors affecting cattle growth and therefore ultimately influences the economic returns of production. This issue has consequently attracted great attention in the beef cattle breeding industry. Marker-assisted selection (MAS) is a rapid and efficient breeding method based on crucial genetic variation markers (Cui et al., 2018; Chen et al., 2019), so finding genetic variation markers associated with muscle development is very important for beef cattle MAS breeding. Given the important role of lncRNAs, we reasoned that it would be feasible to screen for genetic variations in the regions of lncRNAs associated with muscle development. Using Ribo-Zero RNA-Seq, Sun et al. (2016) characterized the lncRNA landscape of bovine embryonic, neonatal, and adult skeletal muscles. Across these three developmental stages, 401 differentially expressed lncRNAs were revealed, including lncMD and some new lncRNAs. Among the newly identified lncRNAs, NONBTAT022788 mapped to the first intron and the second exon (100% sequence identity) of the Bos taurus FAM200B gene (NCBI Reference Sequence: AC_000163.1), and we therefore named it lncFAM200B. In this study, we focused on lncFAM200B because it was differentially expressed in bovine embryonic, neonatal, and adult skeletal muscle [the fragments per kilobase of exon per million fragments mapped (FPKM) values of lncFAM200B were 15.72, 0.41, and 5.73, respectively]. Based on the RNA-Seq results, we speculated that lncFAM200B probably plays an important role in the development of bovine skeletal muscle.
Therefore, in this study, we investigated the sequence and expression characteristics of bovine lncFAM200B and identified functional genetic variations in the lncFAM200B gene. These results lay a foundation for functional research on lncFAM200B and provide scientific data for beef cattle breeding.

MATERIALS AND METHODS
All experiments in this study were approved by the Faculty Animal Policy and Welfare Committee of Northwest A&F University (no.NWAFAC1008). The care and use of experimental animals is in full compliance with local animal welfare laws, guidelines, and policies.

Animal Tissue Samples Collection
To explore the expression profile of lncFAM200B, multiple tissue samples were collected from Qinchuan steers at three developmental stages (embryos of about 3 months, newborns within 1 week of birth, and adults of about 24 months) at Shaanxi Kingbull Livestock Co., Ltd. (Baoji, China). Three individuals were sampled at each developmental stage. For each neonatal and adult individual, seven types of tissue were collected (heart, liver, spleen, lung, kidney, skeletal muscle, and fat). For the embryonic stage, only six types of tissue were collected (no fat). All samples were frozen immediately in liquid nitrogen and stored at −80°C.

Total RNA Isolation, cDNA Synthesis, and RACE Experiments
Total RNA was isolated from samples using TRIzol reagent (TaKaRa, Dalian, China). The quality of total RNA was evaluated by 1% agarose gel electrophoresis and a NanoDrop 2000 spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA). The PrimeScript™ RT reagent Kit with gDNA Eraser (TaKaRa, Dalian, China) was then used to synthesize complementary DNA (cDNA), which served as the template for quantitative reverse-transcription PCR (qRT-PCR) or full-length amplification of lncFAM200B.
Rapid amplification of cDNA ends (RACE) experiments were carried out to identify the full-length of bovine lncFAM200B using bovine fetus skeletal muscle cDNA as template.
The 3′ RACE was performed using the PrimeScript™ RT reagent Kit (TaKaRa, Dalian, China) and the 3′ RACE universal primers QT, QO, and QI as described in Scotto-Lavino et al. (2006). The 5′ RACE was performed using the SMARTer® RACE 5′/3′ Kit (Clontech, Palo Alto, CA, USA) according to the user manual and a previous study. The 3′ RACE and 5′ RACE specific primers for lncFAM200B were designed based on the sequence obtained from RNA-Seq (Table 1). The full-length sequence of bovine lncFAM200B was then obtained by assembling the 3′ and 5′ RACE products.

The Sequence Features Analyses and Functional Prediction of Bovine lncFAM200B
The coding potential was predicted on the Coding Potential Calculator (CPC) website (Kong et al., 2007). The known protein-coding gene CCAAT enhancer binding protein alpha (C/EBPα) and the lncRNA H19 imprinted maternally expressed transcript (H19) were scored as controls. NCBI Open Reading Frame Finder (ORF Finder) was used to analyze the open reading frames (ORFs) of lncFAM200B. A prokaryotic expression system was used to test the protein-coding ability of lncFAM200B: the full-length bovine lncFAM200B and enhanced green fluorescent protein (EGFP) were cloned into the prokaryotic expression vector pET-28a using the XhoI and HindIII restriction enzymes and the In-Fusion® HD Cloning Kit (TaKaRa, Dalian, China). miRDB (http://www.mirdb.org/) was used to predict interacting miRNAs, and AliBaba2.1 (http://gene-regulation.com/pub/programs/alibaba2/index.html) was used to predict the transcription factors that may bind to the promoter region of lncFAM200B.

Quantitative Reverse-Transcription PCR
qRT-PCR was performed to detect the expression of lncFAM200B in tissues. The housekeeping gene glyceraldehyde-3-phosphate dehydrogenase (GAPDH) was used as the internal control. The primers for qRT-PCR are listed in Table 1. qRT-PCR was performed in a Bio-Rad CFX Manager 3.1 (Bio-Rad Laboratories, Hercules, CA, USA) using SYBR® Premix Ex Taq™ II (Tli RNaseH Plus) (TaKaRa, Dalian, China) (Kang et al., 2019a). All samples were run in triplicate. Relative expression levels in tissue samples were calculated using the 2^−ΔΔCt method (Livak and Schmittgen, 2001). Correlations between genes were calculated using Pearson correlation analysis, and differences between samples were assessed using Student's t-test.
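As a minimal sketch of the two calculations named above, the following Python code computes a 2^−ΔΔCt fold change and a Pearson correlation coefficient. The function names and the Ct values in the usage note are hypothetical illustrations, not data or code from this study; the calibrator is assumed to be whichever sample the fold change is expressed against.

```python
from math import sqrt
from statistics import mean


def relative_expression(ct_target, ct_ref, ct_target_cal, ct_ref_cal):
    """Fold change by the 2^-ddCt method (Livak and Schmittgen, 2001).

    ct_target / ct_ref: Ct of the target gene and the reference gene
    (GAPDH in the text) in the sample of interest.
    ct_target_cal / ct_ref_cal: the same two Ct values in the calibrator.
    """
    d_ct_sample = ct_target - ct_ref              # normalise to the reference gene
    d_ct_calibrator = ct_target_cal - ct_ref_cal  # same normalisation, calibrator
    dd_ct = d_ct_sample - d_ct_calibrator
    return 2.0 ** (-dd_ct)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

With hypothetical Ct values, `relative_expression(25.0, 18.0, 27.0, 18.0)` gives a four-fold higher expression than the calibrator, because the sample's ΔCt is two cycles lower.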

Cell Culture, Plasmids Construction, and Transfection
Bovine myoblasts were isolated from skeletal muscle following the procedure used in a previous study from our lab. Cells were cultured in an incubator at 37°C with 5% CO2. The proliferation medium for myoblasts contained 80% Dulbecco's Modified Eagle Medium (DMEM), 20% fetal bovine serum (FBS), penicillin (10 U/ml), and streptomycin (10 mg/ml). When the myoblasts started to fuse, the proliferation medium was replaced by differentiation medium, which contained 2% horse serum, penicillin (10 U/ml), streptomycin (10 mg/ml), and DMEM. RNA was collected from the myoblasts using TRIzol reagent (TaKaRa, Dalian, China) at the proliferation and differentiation stages. Mouse C2C12 myoblasts, mouse 3T3-L1 embryonic fibroblasts, and human embryonic kidney 293T cells were used to identify the active region of the lncFAM200B promoter and to test the effects of single nucleotide polymorphisms (SNPs) on promoter activity. These cells were grown in medium containing 10% FBS, 90% DMEM, penicillin (10 U/ml), and streptomycin (10 mg/ml).
The plasmids were transfected into cells using Lipofectamine 2000 (Invitrogen, Carlsbad, CA, USA). Before transfection, cells were seeded into 96-well plates. When the cells covered 80% of the well bottom, the plasmids were transiently transfected according to the manufacturer's protocol. To normalize for transfection efficiency, pRL-TK was co-transfected with the constructed plasmids at a ratio of constructed plasmid to pRL-TK of 50:1 (Kang et al., 2019b). All transfections were carried out in triplicate. After 36 h, the cells were lysed, and luciferase activity was measured using a BHP9504 microporous-plate luminescence analyzer (Hamamatsu Photons Technology, Beijing, China). The luciferase activity of each promoter fragment was normalized to Renilla luciferase activity (Kang et al., 2019b). Relative luciferase activity is presented as mean ± standard deviation. One-way ANOVA with Bonferroni multiple comparisons was used to analyze differences between groups.
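The normalization described above can be sketched as follows. This is an illustrative outline only: it assumes each well yields one firefly and one Renilla reading, and the function names and the optional rescaling to a reference ratio are assumptions rather than the authors' pipeline.

```python
from statistics import mean, stdev


def relative_luciferase(firefly, renilla):
    """Per-well firefly/Renilla ratios, correcting for transfection efficiency."""
    if len(firefly) != len(renilla):
        raise ValueError("firefly and Renilla readings must be paired per well")
    return [f / r for f, r in zip(firefly, renilla)]


def summarize(firefly, renilla, reference_ratio=1.0):
    """Mean and sample standard deviation of the normalized activity.

    reference_ratio is a hypothetical optional scaling, e.g. the mean ratio
    of the empty pGL3-Basic wells, so that the negative control reads 1.0.
    """
    ratios = [x / reference_ratio for x in relative_luciferase(firefly, renilla)]
    return mean(ratios), stdev(ratios)
```

For three replicate wells, `summarize([100.0, 200.0, 300.0], [10.0, 10.0, 10.0])` returns the mean and standard deviation of the ratios 10, 20, and 30.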

Genetic Variation Analyses of Bovine lncFAM200B Promoter Region
A total of 352 female cattle from four breeds were used to identify novel genetic variations in the bovine lncFAM200B promoter region. Samples of Qinchuan cattle (n = 139), Jinnan cattle (n = 121), Nanyang cattle (n = 67), and Ji'an cattle (n = 25) were randomly collected from Shaanxi, Shanxi, He'nan, and Jiangxi provinces, respectively. Detailed information and body measurement records for these cattle have been published previously (Zhang et al., 2015; Jin et al., 2018). Blood DNA was isolated using the high-salt extraction method (Aljanabi and Martinez, 1997). The primers (SNP-F and SNP-R) used to identify genetic variations were designed based on the DNA sequence of the bovine lncFAM200B gene. All variations were identified by agarose gel electrophoresis and DNA sequencing (Sangon Biotech, Shanghai, China). After genotyping, genotypic and allelic frequencies and population genetic diversity indexes [Hardy-Weinberg equilibrium (HWE), heterozygosity (He), effective population size (Ne), polymorphism information content (PIC)] were calculated according to the methods of Nei (1973) using the MSR website (http://www.msrcall.com/; Yang et al., 2017). Association analyses between genotypes and body measurement records were then performed based on the reduced linear model Y_i = μ + G_i + e, where Y_i is the measured trait value for each animal, μ is the overall mean for each trait, G_i is the effect of genotype, and e is the random error. Breeds were analyzed separately. Because all the cattle were 2−3-year-old females and individuals of the same breed were raised on the same farm, the model excluded the farm, breed, age, and sex factors. Linkage disequilibrium (LD) and haplotype analyses were performed using the SHEsis online platform (http://analysis.biox.cn; Cui et al., 2018).
The association analyses between genotypes or haplotypes and body measurement traits were performed by one-way ANOVA followed by Bonferroni multiple comparison (three groups) or by independent-sample t-test (two groups).
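The diversity indexes listed above can be illustrated for a single biallelic SNP with the short sketch below. It is not the MSR implementation. It assumes genotype counts as input, a chi-squared HWE test with one degree of freedom, and that Ne is computed as the reciprocal of homozygosity (the usual effective allele number formula); all names and the example counts are hypothetical.

```python
from math import erfc, sqrt


def snp_diversity(n_aa, n_ab, n_bb):
    """Allele frequency, HWE test and diversity indexes for a biallelic SNP."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotyped animals")
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p                       # frequency of allele B
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)  # HWE expectations
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    homozygosity = p * p + q * q
    return {
        "p": p,
        "q": q,
        "chi2": chi2,
        # Upper tail of chi-squared with 1 df (assumed): P = erfc(sqrt(x / 2))
        "hwe_p": erfc(sqrt(chi2 / 2.0)),
        "He": 1.0 - homozygosity,                            # heterozygosity
        "Ne": 1.0 / homozygosity,                            # assumed definition
        "PIC": 1.0 - homozygosity - 2.0 * p * p * q * q,     # biallelic PIC
    }
```

For hypothetical counts of 50, 40, and 10 animals, `snp_diversity(50, 40, 10)` gives an allele frequency of 0.7, a heterozygosity of 0.42, and a PIC of about 0.33, which would fall in the moderate polymorphism category (0.25 < PIC < 0.50).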
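Under the reduced model with genotype as the only factor, the association test reduces to a one-way ANOVA across genotype groups. The sketch below computes only the F statistic and the Bonferroni-adjusted significance threshold using the standard library; converting F to a P value needs the F distribution, which a statistics package would supply. The function names and example values are hypothetical.

```python
from statistics import mean


def one_way_anova_f(groups):
    """F statistic and degrees of freedom for a one-way ANOVA.

    groups: list of lists, one list of trait values per genotype.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and residual degrees of freedom")
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-genotype and within-genotype sums of squares
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((y - mean(g)) ** 2 for g in groups for y in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def bonferroni_alpha(alpha, k):
    """Per-comparison threshold for all pairwise comparisons among k groups."""
    n_comparisons = k * (k - 1) // 2
    return alpha / n_comparisons
```

For three genotype groups, `bonferroni_alpha(0.05, 3)` divides the nominal threshold by the three pairwise comparisons.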

Characterization of Bovine lncFAM200B
Because only a partial sequence (369 nt) was obtained by RNA-Seq, 5′ and 3′ RACE were carried out to obtain the full length of lncFAM200B. The 3′ and 5′ RACE yielded 174 bp and 323 bp sequences, respectively (Figure 1). The full-length bovine lncFAM200B was 472 nt and had two exons (Figures 2A, B). The protein-coding potential score of bovine lncFAM200B in CPC was −1.22524, far lower than the scores of the known protein-coding gene C/EBPα and the lncRNA H19 (Figure 2C). In addition, all the ORFs in lncFAM200B were shorter than 100 amino acids, indicating that the coding ability of lncFAM200B was very low. To verify this, the prokaryotic expression system was used, and it showed that no protein was encoded by lncFAM200B (Figure 2D).
The miRNA prediction analysis indicated that 8 miRNAs might interact with lncFAM200B. Five of these miRNAs had scores above 60, so we further predicted their target genes. As a result, some genes associated with cell proliferation were identified, such as insulin like growth factor 2 mRNA binding protein 2 (IGF2BP2) (Figure 2E). Furthermore, because some lncRNAs are known to interact with nearby genes, we searched the genes adjacent to lncFAM200B. Interestingly, we found that fibroblast growth factor binding protein 1 (FGFBP1) was close to lncFAM200B. Thus, lncFAM200B might interact with FGFBP1 and affect cell proliferation and differentiation (Xie et al., 2006). Transcription factor binding site prediction found that the 3000 bp region upstream of lncFAM200B contained 30 C/EBPα, 7 CCAAT/enhancer binding protein beta (C/EBPβ), and 43 SP1 transcription factor binding sites. Hayashi et al. (2016) found that regions enriched with SP1 sites tend to promote the binding of MyoD to DNA. Since MyoD is a crucial transcription factor during muscle cell differentiation, we consider this region likely to be important for the transcription of bovine lncFAM200B.
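The ORF length criterion mentioned above can be illustrated with a simple scan. This is a rough sketch, not the ORF Finder algorithm: it assumes ORFs run from an ATG to the first in-frame stop codon, searches both strands, and reports length in codons excluding the stop. The function names and the test sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_orf_codons(seq):
    """Longest ATG-to-stop ORF on either strand, in codons (stop excluded)."""
    seq = seq.upper().replace("U", "T")  # accept RNA input as well
    best = 0
    for strand in (seq, reverse_complement(seq)):
        for frame in range(3):
            start = None
            for i in range(frame, len(strand) - 2, 3):
                codon = strand[i:i + 3]
                if start is None and codon == "ATG":
                    start = i                       # open an ORF at the first ATG
                elif start is not None and codon in STOP_CODONS:
                    best = max(best, (i - start) // 3)
                    start = None                    # close it at the in-frame stop
    return best


def passes_orf_filter(seq, max_aa=100):
    """True if no complete ORF encodes more than max_aa amino acids."""
    return longest_orf_codons(seq) <= max_aa
```

A transcript would be retained as a putative non-coding RNA by this filter only when `passes_orf_filter` returns True; the threshold of 100 amino acids follows the criterion quoted in the text.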

Expression Profiles of lncFAM200B in Bovine Tissues and Myoblasts
To explore the function of lncFAM200B, we investigated its expression profiles in bovine embryonic, neonatal, and adult tissues. lncFAM200B was widely expressed across bovine tissues at all three developmental stages (Figures 3A-C). In skeletal muscle, the expression level of lncFAM200B was low at each stage but differed significantly among the three developmental stages (Figure 3D), which was consistent with the RNA-Seq data. At the cellular level, we measured the expression of lncFAM200B together with MyoD, MyoG, Myf5, and MYH3, genes important in the regulation of myoblast development, at the myoblast proliferation and differentiation stages (Figure 4). The expression of lncFAM200B showed a significant positive correlation with that of MyoG (Pearson correlation coefficient = 0.922, P = 0.003) and Myf5 (Pearson correlation coefficient = 0.741, P = 0.035) (Table 2). These results suggested that lncFAM200B might be involved in the development of bovine myoblasts.

Identification of Bovine lncFAM200B Promoter Active Region
Considering the characteristics of the lncFAM200B promoter region, we next sought to identify its active region. Six truncated fragments of the promoter region were cloned into the pGL3-Basic plasmid and transfected into 293T, C2C12, and 3T3-L1 cells. Restriction enzyme digestion and plasmid sequencing confirmed that the recombinant plasmids were successfully constructed (Figure 5). The dual-luciferase assay showed that the luciferase activity of the different truncated fragments followed the same trend in the three cell lines (Figure 6D). In each cell line, the luciferase activity of the positive control (pGL3-Control) was high, whereas that of the negative control (empty pGL3-Basic) was low (Figures 6A-D), supporting the validity of the experimental design. pGL3-pro2, pGL3-pro3, and pGL3-pro4 yielded significantly stronger luciferase activity than the other vectors (P < 0.01; Figures 6A-C), which suggested that these fragments contained the promoter active region. The luciferase activity of the longest fragment, pGL3-pro1, was lower than that of pGL3-pro2, pGL3-pro3, and pGL3-pro4 (Figures 6A-C), suggesting that there might be inhibitor binding sites in the region from −2,446 to −1,653 of lncFAM200B. Notably, from pGL3-pro4 to pGL3-pro5 the luciferase activity decreased dramatically (P < 0.01; Figures 6A-C), which indicated that the active region was truncated in pGL3-pro5 and lies between −403 and −139 (264 nt) relative to the lncFAM200B transcription start site (Figure 6D). In addition, transcription factor binding site prediction identified 6 SP1 and 2 C/EBPα potential binding sites in the active region (−403 to −139) (Figure 6E). Above

Novel Genetic Variations in Bovine lncFAM200B Promoter Region
The promoter active region is very important for gene expression, so we asked whether it harbors crucial genetic variations. Based on the DNA sequencing results, two SNPs were revealed in the promoter region of bovine lncFAM200B: SNP1 (NC_037333.1:g.110851632 C-A, rs456951291 in the Ensembl database) and a novel genetic variant, SNP2 (NC_037333.1:g.110851751 C-T, ERZ990081 in the European Variation Archive) (Figure 7). Interestingly, SNP2 was located in the promoter active region of bovine lncFAM200B. At the SNP1 locus, the CC and CA genotypes were identified in cattle (three genotypes were identified in Jinnan cattle). At the SNP2 locus, only CC and CT were identified in the four cattle breeds examined (Table 3). At both loci, C was the major allele in all the breeds examined. The chi-squared test showed that these loci were in Hardy-Weinberg equilibrium (P > 0.05) in the four populations (Table 3). Population genetic parameters further indicated that the loci were polymorphic but belonged to the low (PIC < 0.25) or moderate (0.25 < PIC < 0.50) polymorphism categories (Table 3). LD between SNP1 and SNP2 was then analyzed in the Qinchuan, Jinnan, and Ji'an populations [in Nanyang cattle, fewer than 3 individuals carried CA (SNP1 locus) or CT (SNP2 locus), so we did not perform the LD analysis or the subsequent association analysis]. The D' and r² values in the Qinchuan (D' = 1.000, r² = 0.735), Jinnan (D' = 0.611, r² = 0.049), and Ji'an (D' = 0.857, r² = 0.532) cattle populations showed that these two loci were linked in cattle.
FIGURE 5 | The products of lncFAM200B promoter fragments and the identification of the plasmids using different restriction enzymes. (A) P1 to P6 represent the PCR products of pGL3-pro1 to pGL3-pro6. (B) P1 to P6 represent the recombinant plasmids pGL3-pro1 to pGL3-pro6 digested by different restriction enzymes.
The r² value reflects the extent of linkage disequilibrium, and r² > 0.33 indicates sufficiently strong linkage between two loci. When the genotypes are evenly distributed in the population, D' > 0.33 can also be used to infer linkage disequilibrium. The association analyses found that the genotypes of SNP1 were significantly associated with hip height in Jinnan cattle (P = 0.012). The hip height of CC genotype carriers was 131.7 ± 6.7 cm, which was higher than that of CA (128.6 ± 6.5 cm) and AA (127.1 ± 5.4 cm) genotype carriers, whereas no significant difference was observed between CA and AA genotype carriers (Figure 8). In addition, at both the SNP1 and SNP2 loci, the body measurement traits (hip height, body height, body length, heart girth, rump length) of CC genotype carriers were all higher than those of carriers of the other genotypes in Jinnan cattle (Figure 8). Furthermore, the combined genotypes of SNP1 and SNP2 were significantly associated with hip height in Jinnan cattle (P = 0.033). The hip height of CC-CC carriers (132.0 ± 6.6 cm, n = 44) was markedly higher than that of CA-CT (127.7 ± 7.9 cm, n = 15), CA-CC (128.9 ± 6.0 cm, n = 43), and AA-CC (127.1 ± 5.4 cm, n = 17) genotype carriers (Figure 9). Because only one individual carried CC-CT and one carried AA-CT, these genotypes were excluded from the association analyses (Table 4). In Qinchuan and Ji'an cattle, no significant association was found between SNP1, SNP2, or the combined genotypes and the body measurement traits.
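For the pairwise LD statistics reported for SNP1 and SNP2, the following sketch shows how D' and r² are defined for two biallelic loci. It assumes the haplotype frequency is already known; platforms such as SHEsis estimate it from unphased genotypes, which this sketch does not attempt. The function name and the example frequencies are hypothetical.

```python
def ld_stats(p_ab, p_a, p_b):
    """D, |D'| and r^2 for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and
          allele B at locus 2.
    p_a, p_b: frequencies of allele A (locus 1) and allele B (locus 2).
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1.0 - p_b), (1.0 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1.0 - p_a) * (1.0 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))
    return d, d_prime, r2
```

With hypothetical frequencies `ld_stats(0.1, 0.2, 0.1)`, D' equals 1 while r² is only about 0.44. D' reaches 1 whenever one of the four haplotypes is absent, whereas r² also depends on how similar the allele frequencies at the two loci are, which is why the two statistics can diverge in the same population.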

Influence of the Haplotypes on the Transcriptional Activity of Bovine lncFAM200B
Given the significant association of SNP1 and of the combined genotypes with cattle body measurement traits, we further investigated the mechanism underlying the phenotype. Four plasmids (pGL3-CC, pGL3-CT, pGL3-AC, pGL3-AT) representing the SNP1-SNP2 haplotypes were constructed and transfected into the commonly used 293T cells to measure luciferase activity. The luciferase activity of the positive control (Control) was significantly higher than that of the negative control (empty Basic), and the relative luciferase activity of pGL3-CC was the highest among the four haplotypes in 293T cells. The luciferase activities of pGL3-CC and pGL3-AT were significantly higher than that of pGL3-CT (P < 0.05), but no difference was found among the other haplotypes (Figure 10). These results suggested that the SNP1-SNP2 haplotypes may influence body measurement traits by regulating the expression of lncFAM200B.

DISCUSSION
With the rapid development of high-throughput sequencing technology, an increasing number of lncRNAs have been discovered in many animal species. Structurally, an lncRNA gene resembles a protein-coding gene, with its own promoter, exons, and introns. lncFAM200B was screened from the sequencing results of an earlier study by Sun et al. (2016). In that study, strict criteria were applied to identify lncRNAs from the sequencing results: the number of exons must be ≥2, the length must be ≥200 nt, the read number must be >3, the ORF must be no longer than 100 amino acids, and the predicted protein-coding potential must be weak (Sun et al., 2016). Building on their research, we used different methods (RACE, an in vitro prokaryotic expression system, and protein-coding ability prediction) to further show that lncFAM200B is a novel lncRNA. Expression analysis found that the expression of lncFAM200B was positively correlated with that of MyoG (P = 0.003) and Myf5 (P = 0.035). MyoG, a muscle-specific transcription factor, positively regulates skeletal muscle fiber development, myoblast differentiation, and fusion, and is indispensable for myogenic differentiation (Zammit, 2017). Myf5 is a master regulator belonging to the MRF family and plays a key role in muscle differentiation and myogenesis; it is a master gene for the determination of skeletal muscle, driving myogenic precursors to become myoblasts (Dimicoli-Salazar et al., 2011). Genes with the same expression pattern may share the same function; for example, MEGF10, a myogenic regulator of satellite cells in skeletal muscle, shares a similar expression pattern with MyoG in muscle regeneration (Park et al., 2014). Thus, we hypothesize that lncFAM200B might play a positive role in muscle development.
Molecular markers are based on nucleotide sequence variations among individuals and directly reflect genetic polymorphism at the DNA level. Compared with morphological markers, DNA molecular markers have many advantages. Genomic variations are extremely abundant and are the driving force of biological evolution, providing rich material for animal breeding. DNA markers can be used at any stage of development, for example in early disease diagnosis and in early selection of animals for breeding. The detection of DNA genetic variations is simple and rapid. DNA markers are now widely used in evolutionary analysis, genetic analysis, the diagnosis of genetic diseases, and so on (Alidoust et al., 2018). In animal breeding, it is important to identify crucial markers. In cattle, numerous variations have been identified within protein-coding genes, but only a few studies have uncovered variations in non-coding RNA genes (Jin et al., 2018; Yu et al., 2018). In this study, we first analyzed the SNPs in the promoter region of the lncFAM200B gene and found that SNP1 was linked with SNP2, the mutation in the promoter active region. Importantly, the genotypes of SNP1 and the combined genotypes of SNP1 and SNP2 were associated with hip height in Jinnan cattle.
We then attempted to uncover the cause of the effect of these SNPs on the cattle growth trait. A promoter regulates gene activity by affecting the binding of transcription factors to the promoter DNA sequence. Mutations in a gene promoter region can disrupt gene expression, further resulting in phenotypic changes and disease (Lu et al., 2019). In this study, we used the dual-luciferase reporter system to detect the effects of the SNP1 and SNP2 variations on gene expression. In the commonly used 293T cells, haplotype CC showed the highest luciferase activity, followed by haplotype AT, and both were significantly higher than haplotype CT. Carriers of the combined genotype CC-CC had the highest hip height, which agreed with the luciferase activity data. These results provide further evidence that lncFAM200B is a positive regulator of muscle development.

CONCLUSION
The lncRNA lncFAM200B was differentially expressed in embryonic, neonatal, and adult bovine skeletal muscles. At the myoblast proliferation and differentiation stages, the expression of lncFAM200B was positively correlated with that of MyoG and Myf5. In the lncFAM200B promoter active region (−403 to −139 relative to the lncFAM200B transcription start site), one novel SNP (SNP2, NC_037333.1:g.110851751 C-T, ERZ990081) was discovered, which was linked with the nearby SNP1 (rs456951291). The genotypes of SNP1 and the combined genotypes of SNP1 and SNP2 were significantly associated with hip height in Jinnan cattle. Interestingly, haplotype CC had the highest luciferase activity, and carriers of the combined genotype CC-CC had the highest hip height. Our results suggest that lncFAM200B is a positive regulator of muscle development, and we believe that our study will help advance beef cattle MAS breeding programs.

DATA AVAILABILITY STATEMENT
The detailed information of SNP2 can be found in the European Variation Archive database after 2019/12/31. Project: PRJEB33081; Analyses: ERZ990081.