Integrated Small RNA Sequencing, Transcriptome and GWAS Data Reveal microRNA Regulation in Response to Milk Protein Traits in Chinese Holstein Cattle

Milk protein is one of the most important economic traits in the dairy industry. Yet, the regulatory network of miRNAs for the synthesis of milk protein in mammary is poorly understood. Samples from 12 Chinese Holstein cows with three high ( ≥ 3.5%) and three low ( ≤ 3.0%) phenotypic values for milk protein percentage in lactation and non-lactation were examined through deep small RNA sequencing. We characterized 388 known and 212 novel miRNAs in the mammary gland. Differentially expressed analysis detected 28 miRNAs in lactation and 52 miRNAs in the non-lactating period with a highly significant correlation with milk protein concentration. Target prediction and correlation analysis identified some key miRNAs and their targets potentially involved in the synthesis of milk protein. We analyzed for enrichments of GWAS signals in miRNAs and their correlated targets. Our results demonstrated that genomic regions harboring DE miRNA genes in lactation were significantly enriched with GWAS signals for milk protein percentage traits and that enrichments within DE miRNA targets were significantly higher than in random gene sets for the majority of milk production traits. This integrated study on the transcriptome and posttranscriptional regulatory profiles between significantly differential phenotypes of milk protein concentration provides new insights into the mechanism of milk protein synthesis, which should reveal the regulatory mechanisms of milk secretion.


INTRODUCTION
Milk protein is one of the best protein sources for humans (Anderson et al., 2004). It also affects milk-manufacturing properties such as cheese yields, milk coagulation time and curd firmness (Auldist et al., 2004;Wedholm et al., 2006). Improving milk protein yields and quality can increase the economic outcome of the dairy industry. It has been reported that the amount and compositions of proteins in milk are determined mainly by genetic factors (Auldist et al., 2004). The heritabilities of milk protein compositions were moderate to high in Dutch Holstein-Friesian cattle, ranging from 0.25 to 0.80 (Schopen et al., 2009). So far, several strategies, such as QTL mapping, candidate gene analysis, genome-wide association studies (GWAS), or next-generation sequencing (NGS) technologies (Georges et al., 1995;Gambra et al., 2013;Zhou et al., 2019), have been adopted to identify several causal genes and mutations associated with increased protein yield and composition. However, the synthesis and secretion of milk proteins involve complex physiological and biochemical processes. One of these mechanisms is related to the role of microRNAs (miRNAs), which need to be thoroughly examined.
MiRNAs are a class of small (18-24 nucleotide) RNAs that are involved in the regulation of gene expression by targeting messenger RNAs (mRNAs). The vast majority of miRNA genes are transcribed by the RNA polymerase II, which generates long primary transcripts (pri-miRNA) that contain a hairpin stem-loop structure (Lee et al., 2003). miRNAs are processed from double-stranded hairpin precursors by Drosha protein in the nucleus and Dicer protein in the cytoplasm. The final single-stranded mature miRNA hybridizes with the RNAinduced silencing complex (RISC) to undergo gene inhibition (Robb and Rana, 2007;Kim et al., 2009). Unlike other regulators, miRNAs exert highly complex and combinatorial regulations by targeting hundreds of mRNA transcripts (Shivdasani, 2006). Extensive research in the past decade indicates the involvement of miRNAs in various biological processes such as cell development, proliferation, differentiation and apoptosis (Johnson et al., 2005;Gu and Iyer, 2006;Zhou et al., 2007). Recently, miRNAs have been shown to play important roles in the milk secretion process through their altered regulation of genes involved in milk protein and fat synthesis (Wang et al., 2016;Cui et al., 2020). Fifty-six mammary miRNAs were reported to have significant differences in expression between the lactation and non-lactating periods (begins 60 days before the expected calving) in Holstein cows (Li et al., 2012b). Several miRNAs, such as miR-15a (Li et al., 2012a), miR-139 (Cui et al., 2017), miR-423-5p (Mahmoudi et al., 2015), miR-101b (Tanaka et al., 2009), miR-486 (Li et al., 2015a), miR-152 (Wang et al., 2014), miR-135 (Ji et al., 2015) and miR-138 (Li et al., 2012b), appear to affect milk protein synthesis by regulating key genes of protein synthesis pathways. Although the identification and characterization of miRNA in bovine have been reported (Li et al., 2012b;Le Guillou et al., 2014;Li et al., 2015b;Wang et al., 2016), to our knowledge, only a few studies describe miRNA profiles specific to the synthesis of milk protein in bovine. The inspiration of many miRNA studies in milk protein synthesis in bovine was extrapolated, some even from another biological process that was unknown in mammary tissue before (Li et al., 2015a;Cui et al., 2017). The real miRNA profiles specific to milk protein traits are limited in bovine.
In this study, the hypothesis was that miRNAs have potential roles in milk protein production of cattle. Using miRNA-seq and RNA-seq, we investigated the miRNA profiles of mammary glands from 12 Chinese Holstein cows with three high (≥3.5%) and three low (≤3.0%) phenotypic values for milk protein percentage in lactation and non-lactating period. We believe that the results from the integrated transcriptome analyses of miRNA, mRNA and GWAS signals will help us identify new miRNA related to milk protein, further enhancing our understanding of the mechanisms of milk protein synthesis.

Ethics approval and consent to participate
All animal experiments were performed following the recommendations in the Guide for the Care and Use of Laboratory Animals of China. The study protocol was approved by the College of Animal Science 98 and Technology, China Agricultural University (Permit Number: DK996).

Mammary Samples
The 12 multiparous (second to fourth parities) and healthy Chinese Holstein cows with three too high and three low phenotypic values for milk protein percentage peak and nonlactation period were chosen from the Beijing Sanyuan Dairy Farm Center (Beijing, China), which has been described in the previous study (Li et al., 2016). In brief, the cows were maintained in free stall housing and were fed a total mixed ration (TMR) with ad libitum access to water. We defined a high milk protein percentage group as those cows with 3.5% protein and the low milk protein percentage group was composed of cows with 3.0% protein throughout the previous lactation based on Dairy Herd Improvement system (DHI) data (Supplementary Table S1). Six cows were selected at approximately 79 days postpartum (peak lactation) and the other six cows during the non-lactating period (∼30 days before the expected calving).
All mammary samples were retrieved using a biopsy, which was performed according to the method of with modifications. Briefly, the skin of the selected biopsy site was first shaved and disinfected with ethanol (75%). Then, the site was anesthetized with SU-MIAN-XIN and injected subcutaneously with 1 ml of procaine. A 1.5-cm incision was made in the skin at the midpoint of a rear quarter of the mammary gland and connective tissue using shears and tweezers, which was blunt-dissected away exposing the secretory gland capsule. Mammary tissue biopsy (∼500 mg) was then obtained and immediately placed in liquid nitrogen and subsequently stored at −80°C until RNA isolation. The suture was tied as the cannula was removed and pressure applied to reduce the collection of blood under the skin. Immediately after the experiment, all 12 cows received antibiotic prophylaxis and anti-inflammatory therapy.

RNA Extraction and Library Preparation for Small RNA Sequencing
Total RNA was extracted using TRIzol reagent (Invitrogen, Carlsbad, CA, United States). Twelve small RNA libraries from RNA integrity and concentration were assessed using the RNA Nano 6000 Assay Kit of the Bioanalyzer 2,100 System (Agilent Technologies, CA, United States). All RNA samples had an RNA integrity number of at least 7.5. Fifteen percent agarose gels separated the total RNA to extract the small RNA (18-30 nt). After precipitation by ethanol and centrifugal enrichment of small RNA samples, the library was prepared according to the method and process of Small RNA Sample Preparation Kit (Illumina,. The RNA concentration of the library was measured using Qubit ® RNA Assay Kit in Qubit ® 2.0 for preliminary quantification and then diluted to 1 ng/μl. Insert size was assessed using the Agilent Bioanalyzer 2100 system (Agilent Technologies, CA, United States). After the insert size consistent with expectations, the qualified insert size was accurate quantitative using the Taqman fluorescence probe of AB Step One Plus Real-Time PCR system (Library valid concentration >2 nM). The qualified libraries were sequenced by an Illumina HiSeq 2500 platform and 50-bp single-end reads were generated.

Identification of Small RNAs
Quality trimming and adaptor removal of the Illumina reads were carried out using Cutadapt 2.8 and Trimmomatic 0.36 (Martin, 2011;Bolger et al., 2014). After filtering for their size (18-30 nt), the cleaned reads were categorized into unique tags and then mapped to the bovine (ARS-UCD 1.2) reference genomes by Bowtie 1.1.1 with one mismatch (Langmead, 2010). All the downstream analyses were based on the mapped small RNA tags.
The matching sequences ranging from 18 to 30 nt were used to align against mirbase 22.0 (http://www.mirbase.org/) to identify known miRNAs by miRDeep2 with a quantifier. pl module (Mackowiak, 2011). The sequences matching other small RNAs, including rRNA, snRNA, repeat RNA, tRNA and snoRNA, were compared with Bos taurus noncoding RNA sequences in the Sanger RNA family database (Rfam 12.1) using Infernal 1.1 (Griffiths-Jones et al., 2003;Nawrocki and Eddy, 2013). Unannotated sequences combined with the known miRNA annotation from Ovis aries, Capra hircus, Sus scrofa, Mus musculus, and Homo sapiens were used to predict the novel miRNAs according to the characteristic hairpin structure of miRNA precursors by miRDeep2 core module miRDeep2. pl. The miRNAs expressed in at least two samples were considered as novel miRNAs. To make small RNA mapped to unique annotation, we followed the priority rule: known miRNA > rRNA > tRNA > snRNA > snoRNA > repeat > novel miRNA > ta-siRNA.

Differential Expression Analysis
Differentially expressed (DE) miRNAs between high and low milk protein percentage during peak and non-lactating periods (i.e., HP vs. LP, HD vs. LD) were investigated using the DESeq2 R package (Love et al., 2014). RNA-Seq read counts were modeled by a generalized linear model considering the experimental design, with two phenotypes (high milk protein percentage and low milk protein percentage) and two stages of lactation (peak lactation and non-lactating period). The model for the HP vs. LP and HD vs. LD comparisons only included the phenotype factor. The statistical power of this experimental design was estimated using the SSPA R package (Iterson et al., 2013), which reached 0.63 and 0.76 for the HP vs LP and HD vs. LD, respectively (Supplementary Figure S1). MiRNAs with a p-value <0.05 and log 2 (fold change)| >0.8 were assigned as DE (Zheng et al., 2015). The expression patterns of DE miRNAs across four groups were performed using the k-mean method (Ahmad and Dey, 2007). Using gap statistics, we determined that k 7 was the optimal choice for distinguishing these miRNAs (Supplementary Figure S2).

MiRNA Function Prediction and Regulatory Network Construction
We predicted the binding of DE miRNAs to the putative targets using miRanda V3.3a with score ≥50 and energy ≤ −20 kcal/mol (Enright et al., 2003). The predicted target genes were combined with the previous transcriptome profiling data (Li et al., 2016). We identified the correlations between miRNA and target genes in expression using an in-house R script. Briefly, the expressions of miRNA and their targets were sample-matched for all samples. Then for each miRNA, Pearson correlation coefficients were computed for all its targets; only targets significantly (p-value <0.05) and inversely correlated with miRNAs in expression were obtained. To evaluate the miRNA-gene regulatory network, GO term and KEGG pathway enrichment analyses were used to investigate putative functions of target genes using DAVID (https://david.ncifcrf.gov/) (Huang et al., 2007). The statistical significance of GO term or KEGG pathway enrichment was measured by Fisher's exact test with p-value <0.1.

Regulatory Network Construction
We selectively analyzed the DE miRNA-DE mRNA pairs that the targets of DE miRNA were also in DE either HP vs. LP or HD vs. LD. After the functional annotation, miRNAs and their targeting genes and pathways were subjected to the network visualization analysis. The Cytoscape 3.2.1 software was used to construct the network (Smoot et al., 2011).

The Enrichment Analysis of Genome-Wide Association Studies Signals
We obtained summary statistics of single-trait GWAS for five milk production traits (milk yield, milk protein yield, milk fat yield, milk protein percentage, and milk fat percentage), heifer conception rate, and somatic cell score (SCS) in cattle, as described previously (Jiang et al., 2019). Here we provided a summary. The de-regressed PTAs (predicted transmitting abilities) were used as a phenotype in all seven traits. The imputation phase was from Run 5 of the 1000 Bull Genomes Project (Daetwyler et al., 2014). After sequence marker imputation and quality control, genotypes of 3,148,506 Frontiers in Genetics | www.frontiersin.org October 2021 | Volume 12 | Article 726706 sequence variants for 27,214 Holstein bulls were obtained. The single-trait GWAS analyses were conducted using a mixed-model approach by MMAP (https://mmap.github.io/). To check whether the SNP effects were more enriched in candidate feature than background regions, we applied a sumbased method for GWAS signal enrichment analysis. The sumbased method uses signals of all markers within a predefined candidate feature. Briefly, we calculated the following summary statistics for candidate feature: in which T sum is the summary statistics for a tested feature group, is the number of SNPs located in the candidate feature and β is the estimate of marker effect obtained from the GWAS summary statistics. Using Eq. 1, we calculated the T sum for candidate feature.
To perform the permutation test in background feature, we first assumed the SNP markers of the GWAS summary statistics located in background regions which were numbered as 1 . . . M. The observed SNPs located in the prior feature were N 1 , N 2 , N 3 , . . ., N n . Their test statistics were β 2 N1 , β 2 N2 , β 2 N3 , . . ., β 2 Nn . We chose number R within 1 ∼ M. Then, the observed SNP set was shifted to the new rank order (P 1 , P 2 , P 3 , . . ., P n ) based on random number R using the following formula: All test statistics were moved to the new positions, with the remaining markers maintaining the original order. A new summary statistic of a background region (β 2 P 1 , β 2 P 2 , β 2 P 3 , . . ., β 2 P n ) was calculated based on the original position of the feature. The permutation in background regions was repeated 10,000 times for each studied candidate feature and an empirical p-value was then calculated based on one-tailed tests of the proportion of randomly sampled summary statistics larger than that observed using the following formula: where N over represents the times of the permutated T sum which was found larger than that of the candidate feature T sum . We corrected empirical p values for the multiple testing using the FDR method implemented in R (p.adjust function) and then considered FDR < 0.05 as significant. To avoid bias by the DGAT1 gene, we conducted the GWAS enrichment analysis by excluding SNPs located in the DGAT1 gene or 1-Mb upstream/ downstream extended region of the DGAT1 gene. This sumbased method for GWAS signal enrichment analyses using Perl scripts is available (https://github.com/WentaoCai/GWAS_ enrichment).

Overview Over Small RNA Sequencing
To study miRNAs in milk protein synthesis' complex process, we profiled miRNA expression differences between the high milk protein percentage and low milk protein percentage groups in both lactation and non-lactating period using small RNA sequencing. After trimming adaptor sequences and removing contaminated reads, an average of 23.0 million clean reads ranges from 22.4 to 23.9 million were generated. Then, we categorized them into unique tags; an average of 1.1 million unique tags was obtained (Supplementary Table S2). We separately mapped clean reads and unique tags to the bovine (ARS-UCD1.2) reference genomes. The mapping rates were about 90.0 and 74.4% using total clean reads and unique tags, respectively (Supplementary Table S3). The majority of the mapped reads ranged from 21 to 23 nt in length and the 22-nt small RNA was the most abundant ( Figure 1A). As expected, most reads were observed to match with the 3′-untranslated region (UTR) and 5′-UTR allocating miRNAs ( Figure 1B). These results confirm that the small RNA sequencing process is reliable in our study. The residual fraction of mapped reads not corresponding to miRNAs was distributed among a miscellanea of annotated regions, including rRNAs (14.76%), tRNAs (3.48%), snoRNAs (0.38%), snRNAs (0.78%) and repeats (0.04%) ( Figure 1C).

Identification of Known and Novel miRNAs
We identified 388 known miRNAs expressed in at least two samples, which accounted for 38.7% of all known bovine miRNAs deposited in miRbase 22.0 (transcripts per million, TPM >0.5).
Despite differences in sample characteristics, samples from the same group clustered together based on their miRNA expression profiles ( Figure 2A). The first principal component (PC1), explaining the most variance in miRNA expression, separates the samples by lactating stage. PC2 separates the samples by the phenotype of milk protein percentage. We also compared the miRNAs with the top greatest expression (top 20) in the mammary tissue at lactation and non-lactating periods (Supplementary Figure S3). The top expressed miRNAs in both of the two stages were the same except for miR-142 and miR-126, which were explicitly expressed higher in lactation and non-lactating period, respectively. The highest expression of miRNAs in lactation was miR-148a, while miR-143 was the most highly expressed in the non-lactating period. In addition, we characterized 297 novel miRNAs expressed in at least two samples, including 172 and 188 novel miRNAs identified in lactation and non-lactating period. Interestingly, we found 17 novel miRNAs expressed in all 12 samples (Supplementary Interestingly, we found that 14 DE miRNAs exhibited common expression level differences across the two comparison groups (Table 1; Figure 2D). The clustering heat map of all 66 DE miRNA expression profiles from HP vs. LP or HD vs. LD is shown in Figure 3A.

Target Gene Prediction of Differentially Expressed miRNAs
To better understand the function of DE miRNAs, putative target genes were predicted using the 3′-UTR sequence of mRNA by the miRanda software. We predicted 9,156 target mRNAs for the 28 DE miRNAs in HP vs LP and 10,045 target mRNAs for the 52 DE miRNAs in HD vs. LD. To identify target genes with high confidence, we performed correlation analysis between DE miRNAs and their target genes in the expression levels in the 12 mammary samples. The expression of target genes from the 12 similar samples was quantified by RNA-seq mentioned in our previous study (Li et al., 2016

Functional Annotation of Differentially Expressed miRNAs
To functionally classify the DE miRNAs, GO and KEGG enrichment analyses were performed for DE miRNAs' confident target genes in lactation and non-lactating period, respectively. Functional annotation showed that these 914 target genes of common DE miRNAs were significantly  enriched in 60 pathways and 123 GO terms. Many pathways were associated with protein synthesis, insulin secretion, mTOR signaling pathway, estrogen signaling pathway, insulin signaling pathway and GnRH signaling pathway. Many GO terms were involved in protein synthesis, such as protein transport, trans-Golgi network, metabolic process and protein serine/threonine kinase activity (Supplementary Table S7A).
For specifically DE miRNAs in lactation, their target genes were enriched in 18 pathways and 110 GO terms. The pathways were involved in the mTOR signaling pathway, TNF signaling pathway, leukocyte transendothelial migration and MAPK signaling pathway. The functional terms involved in protein synthesis were noticed for positive regulation of transcription, post-Golgi vesicle-mediated transport, mRNA 3′-UTR binding and ER to Golgi transport vesicle (Supplementary Table S7B). For specifically DE miRNAs in the non-lactating period, their target genes were enriched in 37 pathways and 205 GO terms. Several target genes were observed to be involved in the PI3K-Akt signaling pathway, metabolic pathways and mTOR signaling pathway. Their functional terms were associated with protein transport, transcription, vasculogenesis and positive regulation of gene silencing by miRNA (Supplementary Table S7C). All enriched KEGG pathways and the top 10 significant GO terms are shown in Figure 3B.
Trends in DE miRNAs in lactation or non-lactating period were examined using k-means clustering, which revealed that 66 DE miRNAs in either HP vs. LP or HD vs. LD could be divided into seven distinct clusters with differentially expression level pattern changes (Figure 4). Most clusters (such as clusters 1, 2, 5 and 7) revealed that the expression change pattern of miRNAs in HP vs LP was similar to HD vs LD. Functional annotation reveals that these 26 miRNAs with similar expression patterns were involved in a variety of biological processes, such as biosynthesis of antibiotics, metabolic pathways and TNF signaling pathway (Supplementary Table S8). Clusters 3 and 4 indicated that some miRNAs were stably expressed between HP and LP, while they were dynamically changed between HD and LD.

Regulatory Networks for Differentially miRNAs-mRNAs
To better understand the relationship between miRNAs and milk protein traits, we selectively analyzed the 214 miRNA-mRNA pairs. Both miRNAs and their targets were DE in HP vs LP or HD vs LD. We found that 22 DE miRNAs potentially regulated 24 DEGs which were involved in milk protein synthesis ( Table 2). For example, PSPH, as a phosphoserine phosphatase that functions as phosphotransferase that is responsible for the third and last steps in L-serine formation, was involved in the biosynthesis of amino acids, metabolic pathways and glycine, serine and threonine metabolism. The expression of miR-1 was negatively correlated with PSPH. FABP3, targeted by miR-146b and miR-185, influences fat and protein content in cattle (Kulig et al., 2013). Additional genes are listed in Table 2. The networks of candidate target genes involved in milk protein synthesis through various pathways are shown in Figure 5.

Differentially Expressed miRNA Genes are Enriched With GWAS Signals of Milk Protein Traits
To assess whether DE miRNAs were associated with GWAS signals, we applied enrichment analysis from GWAS for all correlated targets of DE miRNAs across five milk traits (milk yield, milk protein yield, milk fat yield, milk protein percentage and milk fat percentage), one reproduction trait (heifer conception rate) and one health trait (somatic cell sore, SCS). Since very few imputed SNPs were observed within the miRNA precursor regions due to their short lengths, the analysis included the flanking ±50 kb sequences of DE miRNA precursors to capture proximal SNPs in the regulatory regions. The background regions were all miRNA precursors in miRbase 22.0 and their flanking ±50-kb regions. We did not detect significant enrichments for milk production in both HP vs LP and HD vs LD (Supplementary Table S9a). However, after removing 1,737 SNPs close to DGAT1, significant (FDR < 0.05) enrichments were observed for the milk protein percentage trait in DE miRNA of HP vs LP, while the GWAS signals of the protein trait were enriched in DE miRNAs of HD vs LD (FDR < 0.1, Table 3). Next, the DE miRNAs were separated into upregulated and downregulated miRNAs based on their log2FC >0 (up) or <0 (down) across comparison groups. The milk protein GWAS signals were significantly more enriched in upregulated miRNAs than in downregulated miRNAs in HP vs LP, while the opposite results were found in HD vs LD (Supplementary Table S9B). These results suggested that the milk protein variations in the traits may be associated with the DE miRNA genes.

Target Genes of Differentially Expressed miRNAs are Enriched With GWAS Signals of Milk Protein Traits
To investigate the joint effect of genetic variations in miRNA targets on milk production traits in dairy cattle, we conducted the GWAS enrichment analysis using inversely and significantly correlated targets of DE miRNA. Only SNPs located in targets or in the 5 kb extended region of targets were included in calculating the squares of their effects. For comparison, 10,000 random SNP sets located in all annotated genes (Ensemble 95) or their 5 kb extended areas were generated. As shown in Figure 5A, the correlated targets of DE miRNAs were enriched with GWAS signals of the milk protein trait (FDR < 0.05, Supplementary Table S10A). After correcting the DGAT1 bias, significant (FDR FIGURE 4 | The expression pattern of DE miRNAs using k-means clustering. The 66 DE miRNAs in either HP vs. LP or HD vs. LD could be divided into seven distinct clusters with differentially expression level pattern changes. The y-axis represents the relative expression using mean normalization. Most clusters (such as clusters 1, 2, 5 and 7) revealed that the expression change pattern of miRNAs in HP vs. LP was similar to HD vs. LD.
Frontiers in Genetics | www.frontiersin.org October 2021 | Volume 12 | Article 726706 8 < 0.05) enrichments were observed for all five milk production traits in HP vs LP. For the targets of DE miRNAs in HD vs LD, significant enrichments were kept for milk protein and SCS traits. Similar with the GWAS enrichment analysis of DE miRNA genes, we also found that the milk protein GWAS signals were significantly more enriched in targets of upregulated miRNAs than in targets of downregulated miRNAs in HP vs LP, while the opposite results were found in HD vs LD (Supplementary Table  S10B). In addition, we did not find any enrichments in either HP vs LP or HD vs LD for the GWAS signals of the heifer conception rate trait (Supplementary Table S10B).

DISCUSSION
In this study, we obtained a comprehensive landscape of miRNAs associated with milk protein in the context of miRNA profiles across 12 mammary tissue samples during two different stages of lactation. Importantly, we identified candidate miRNAs and networks related to milk protein by integrating miRNAs, transcriptome and GWAS signals. These findings provide valuable insights for lactogenesis and yield a suite of molecular breeding resources to optimize the content of milk proteins.
There were 388 known miRNAs expressed in our study, accounting for 38.7% of all known bovine miRNAs deposited in miRbase 22.0. A total of 297 novel miRNAs were detected in this study, which will considerably increase bovine miRNAs' repertoire. A weakness of this study is the lack of visual inspection for the biopsy sample, but we found that the top expressed miRNAs of our mammary biopsy sample were similar with those of previous studies (Jin et al., 2014;Billa et al., 2019). The differentially regulated expression patterns of miRNAs in mammary gland tissue underscore that the synthesis and secretion of milk protein involve a high level of posttranscriptional regulation of gene expression by miRNAs. The 14 DE miRNAs between high and low milk protein percentages across both lactation and non-lactating periods suggest that these miRNAs may partially regulate the functions of the same biological or physiological processes in the two periods.
After combining the target prediction with expression correlation analysis, we matched 1,685 inversely correlated targets that resulted in 2,468 miRNA-mRNA pairs for HP vs LP and 2,280 inversely correlated targets that resulted in 3,697 miRNA-mRNA pairs for HD vs LD. Functional annotation showed that these inversely correlated targets of common DE miRNAs across two stages were associated with mTOR signaling pathway, estrogen signaling pathway, insulin signaling pathway, and GnRH signaling pathway, implying that these miRNAs could be critical to factors that affect milk quality and yield. It should be noted that some of the common DE miRNAs in this study have been previously suggested to play essential roles in milk protein synthesis. For example, miR-152 negatively regulates DNA methyltransferase 1 (DNMT1), decreasing the global DNA methylation and increasing the expression of serine/threonine protein kinase Akt (AKT) and peroxisome proliferator-activated receptor gamma (PPARc) (Wang et al., 2014). These target genes of DE miRNAs, specifically for lactation, were involved in positive transcription, mRNA 3′-UTR binding, and ER to the Golgi transport vesicle. Also, miR-423-5p has been shown to regulate AMPK-gamma1 (AMPKc1) negatively. The 3′-UTR SNP of AMPKc1 was influential on the milk and protein yield traits. This mutation also changed target mRNA base-pairing to miR-423-5p, which implied that miR-423-5p plays an important role in milk metabolism pathways (Mahmoudi et al., 2015). These target genes of DE miRNAs, especially for the non-lactating period, were also associated with some milk protein metabolisms, such as PI3K-Akt signaling pathway, metabolic pathways and mTOR signaling pathway. For example, miR-486 directly downregulates PTEN gene expression, altering the expression of downstream genes, such as AKT and mTOR. miR-486 as a downstream regulator of PTEN is required for the development of the cow mammary gland (Li et al., 2015a).  The DE miRNA-DEG regulatory networks provided a comprehensive profile for understanding the mechanism of milk protein synthesis in cows. Twenty-two DE miRNAs which potentially regulated 24 DEGs associated with milk protein metabolism were identified. MiR-1 is a known suppressor involved in PI3K-AKT, mTOR, and NFκB pathways (Safa et al., 2020). miR-1 controls cholesterol synthesis and regulates mammary proliferation by targeting IGF1 and TBX3 in the sow's mammary gland (Lin et al., 2020). Here, we found that the expression of miR-1 was negatively correlated with PSPH, which is an insulinresponsive gene in bovine mammary that is involved in protein synthesis (Menzies et al., 2009). Besides, proteins encoded by PSPH are engaged in serine synthesis (Brearley et al., 2019;Xuan et al., 2020). miR-146b was upregulated in the mammary glands of the HP group, which was reported to be involved mainly in leukemia, epidermal growth factor receptor (EGFR) signaling, MAPK, and nuclear factor kappa-light-chainenhancer of activated B cells (NF-κB) signaling pathways (Mathews et al., 2004;Taganov et al., 2006;Xiang et al., 2014). Moreover, miR-146b was associated with mammary gland development and stem cell activity (Wicik et al., 2016). The expression of FABP3 was negatively correlated with miR-146b. SNPs within FABP3 have been reported to influence fat and protein content in cattle (Kulig et al., 2013). These findings indicated that the expression change in DEGs and DE miRNAs within networks might contribute to milk protein metabolism in cows.
We integrated DE miRNA genes and their correlated targets with GWAS signals of milk production traits using the sum-based marker-set test method, which has been demonstrated to have higher power or at least equal to most commonly used marker-set test methods in polygenic traits (Sorensen et al., 2017;Fang et al., 2018). Our analysis revealed significant enrichment of milk protein GWAS signals in proximity to the precursors and target genes of DE miRNAs, especially to DE miRNAs in lactation, which implied that the DE miRNAs of lactation were more associated with milk protein traits. The GWAS signals of heifer conception rate trait were not enriched in targets of milk protein-associated miRNAs, which could be explained by negative genetic correlations between milk production traits and reproduction traits (Strucken et al., 2012;Tiezzi et al., 2012). Previous studies have reported that the DEGs in non-lactating periods could help the mammary tissue prevent issues with inflammation and udder disorders (Li et al., 2016). Of interest, we found that the DE miRNAs of the nonlactating period were related to the SCS trait. The differences in enrichments of up/downregulated miRNAs between lactation and non-lactating period indicated that the miRNAs might have different patterns of regulation involved in milk-related activities.

CONCLUSION
This study integrated small RNA sequencing with RNA-seq in the mammary gland to detect genes/pathways associated with milk protein synthesis in cows. We provide genomic evidence that DE miRNA genes in lactation are significantly enriched with GWAS signals for milk protein percentage traits and that enrichments within DE miRNA targets are significantly higher than in random gene sets for the majority of milk production traits. Responsive miRNAs in the mammary gland played roles in the regulation of the milk protein synthesis and the dysregulation of overall metabolism, providing novel milk-biological insights into the genetic mechanisms. The results should further enhance our understanding of miRNA expression profiles associated with milk protein concentration, allowing us to develop more effective breeding strategies.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: https://www.ncbi. nlm.nih.gov/, PRJNA689373, https://www.ncbi.nlm.nih.gov/ geo/, GSM5511040.

ETHICS STATEMENT
The animal study was reviewed and approved by The College of Animal Science and Technology, China Agricultural University.

AUTHOR CONTRIBUTIONS
SZ, JS, and JL conceived and designed the study and revised the manuscript. WC and CL performed the small RNA related experiments, data analysis and drafted the manuscript. All authors read and approved the final manuscript.

ACKNOWLEDGMENTS
We appreciate the Beijing Dairy Cattle Center and Beijing Sanyuan Dairy Farming Center for providing the mammary samples of Chinese Holstein dairy cows.

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2021.726706/ full#supplementary-material Frontiers in Genetics | www.frontiersin.org October 2021 | Volume 12 | Article 726706 Supplementary Figure 1 | The statistical power of this experimental design. The points with green and red color represent the statistical power for HP vs LP and HD vs LD, respectively. When sample size is 3, the statistical power reached 0.63 and 0.76 for the HP vs. LP and HD vs. LD, respectively.
Supplementary Figure 2 | The optimization of the k-mean using gap statistics. We determined that k 7 was the optimal choice for distinguishing these DE miRNAs.