Identification of Genetic Loci Associated With Crude Protein Content and Fiber Composition in Alfalfa (Medicago sativa L.) Using QTL Mapping

Forage quality determined mainly by protein content and fiber composition has a crucial influence on digestibility and nutrition intake for animal feeding. To explore the genetic basis of quality traits, we conducted QTL mapping based on the phenotypic data of crude protein (CP), neutral detergent fiber (NDF), acid detergent fiber (ADF), and lignin of an F1 alfalfa population generated by crossing of two alfalfa parents with significant difference in quality. In total, 83 QTLs were identified with contribution to the phenotypic variation (PVE) ranging from 1.45 to 14.35%. Among them, 47 QTLs interacted significantly with environment and 12 QTLs were associated with more than one trait. Epistatic effect was also detected for 73 pairs of QTLs with PVE of 1.08–14.06%. The results suggested that the inheritance of quality-related traits was jointly affected by additive, epistasis and environment. In addition, 83.33% of the co-localized QTLs were shared by ADF and NDF with the same genetic direction, while the additive effect of crude protein-associated QTLs was opposite to that fiber composition on the same locus, suggesting that the loci may antagonistically contribute to protein content and fiber composition. Further analysis of a QTL related to all the three traits of fiber composition (qNDF1C, qADF1C-2, and qlignin1C-2) showed that five candidate genes were homologs of cellulose synthase-like protein A1 in Medicago truncatula, indicating the potential role in fiber synthesis. For the protein-associated loci we identified, qCP4C-1 was located in the shortest region (chr 4.3 39.3–39.4 Mb), and two of the seven corresponding genes in this region were predicted to be E3 ubiquitin-protein ligase in protein metabolism. Therefore, our results provide some reliable regions significantly associated with alfalfa quality, and identification of the key genes would facilitate marker-assisted selection for favorable alleles in breeding program of alfalfa quality improvement.


INTRODUCTION
Crude protein and fiber components are two key indicators of forage quality, which affects digestibility and nutrition intake for animal feeding. Perennial legume alfalfa (Medicago sativa L.) has become an important forage in the diet of ruminants due to its high protein content (15-20%), as well as vitamins and minerals (Annicchiarico et al., 2015). Improvement of alfalfa's protein content lowers the cost of raising of ruminants. Generally, fiber component of alfalfa is consisted of neutral detergent fiber (NDF), acid detergent fiber (ADF), and lignin. Among them, NDF is negatively correlated with feed intake of animals (Oba and Allen, 1999), and high content of ADF or lignin lowers digestibility (Sarwar et al., 1999). Therefore, breeding alfalfa new varieties with improved quality has been a major focus of the forage breeders worldwide.
Since the proposal of QTL interval mapping (IM) in 1989 (Lander and Botstein, 1989), QTL mapping has become a focus of quantitative genetics research. There have been a number of QTL studies in other crops, such as rice and soybean, for these quality-related traits (Zhang et al., 2008(Zhang et al., , 2018Warrington et al., 2015;Bazrkarkhatibani et al., 2019). For example, 47 QTLs related to fiber were identified in a corn RIL population (Li et al., 2017), and 14 QTLs related to grain protein content were mapped in a rice BC 3 F 4 population, and gluten family genes were identified from qGPC1.1 .
QTL research in alfalfa is rather slow primarily due to its nature of heterozygosity and autotetraploidy which hampers the construction of genetic maps. It has been documented that the linkage distance between molecular markers could be estimated using single dose alleles (SDA) with a separation ratio of 1:1 in alfalfa (Brouwer and Osborn, 1999). Based on SDA, using this method, a series of alfalfa genetic maps containing eight chromosomes have been constructed (Sledge et al., 2005). In 2014, a saturated genetic linkage map was developed using 3,591 single nucleotide polymorphism (SNP) markers of alfalfa . Some of the linkage maps have been applied to analyze the agriculturally important traits especially yield, fall dormancy and stress tolerance to advert environmental conditions (Ray et al., 2015;Adhikari et al., 2018;Zhang et al., 2019). Used the method described by Zhang et al. (2020), we previously also constructed an alfalfa linkage map containing 3,818 SNP markers obtained by genotyping-by-sequencing (GBS) of an F 1 population of about 392 individual plants. To identified the key QTLs or candidate genes associated with alfalfa quality, here we conducted QTL mapping based on our 3-year data of crude protein content and fiber composition of the F 1 population. Our results provide useful information on reliable candidate locus for alfalfa quality improvement.

Plant Materials and Growth Conditions
Alfalfa materials used in this study were an F 1 population of 392 individuals generated by crossing of a landrace "Cangzhou" (the paternal parent) with Zhongmu No.1 (the maternal parent) as described previously .
The field trials were conducted at the research base of the Chinese Academy of Agricultural Sciences in Langfang, Hebei province. The average temperature annually is 11.9 • C, the average temperature in the coldest month (January) is −4.7 • C, and in the hottest month (July) is 26.2 • C. The annual precipitation is 554.9 mm. The soil is medium loam with 1.69% organic matter and a pH of 7.37.

Phenotyping
Seedlings including F 1 population and the parental plants were cultured in greenhouse under conditions of 16 h day/8 h night, 22 • C, and 40% relative humidity. Clones were generated via stem cuttings and transplanted into field in early April of 2014 with three clones as replicates randomly planted in three adjacent blocks. Row spacing is 100 cm and column spacing 80 cm. Plants were clipped late fall with the ground retention height of 5 cm. No application of fertilizer or irrigation was carried out for the field.
The first cut of 2016, 2019, and 2020 with a height of ground stubble at 5 cm was used for phenotypic data collection. Samples were oven-dried (60 • C, 24 h), and ground to pass through a 1mm sieve. The content of crude protein, NDF, ADF, and lignin was measured using near-infrared reflectance spectroscopy (Foss NIRS 1650) with an analytical model corrected by chemically measured data of 20 alfalfa samples. SAS 9.4 was performed to analyze the mean comparison of the parental phenotypic data and the normal distribution test (variation range, mean, variance, skewness, and kurtosis) and correlation analysis for the crude protein, NDF, ADF, and lignin of F 1 population. The frequency distribution of the F 1 population phenotypic data was plotted using SPSS 18.0 (Baarda and van Dijkum, 2019). The broad sense heritability (H 2 ) of each trait was calculated using the ANOVA function of the software QTL-IciMapping (Meng et al., 2015).

QTL Analysis
Raw data of Genotyping-by-Sequencing (GBS) were submitted to the NCBI Sequence Read Archive with bioproject ID: PRJNA522887 1 .
The genetic linkage map was constructed as has been described by our previous study . Since our population is a pseudo-testcross F 1 population, we constructed linkage map separately in maternal and paternal parents (Grattapaglia, 1997).

Analysis of the Phenotypic Variations
For evaluation of alfalfa quality, phenotypic data including crude protein and fiber composition of F 1 population and its two parental plants were measured in 2016, 2019, and 2020. According to the t-test, the crude protein content of the parental parent was significantly higher than that of the maternal parent (21.19 vs. 18.83%; 22.32 vs. 19.27%) in 2016 and 2019 (P < 0.01). In contrast, the content of NDF, ADF, and lignin of the paternal parent was significantly lower than that in the maternal parent tested in the 3 years (P < 0.01) except lignin in 2020 (Supplementary Table 1). The results demonstrated that the paternal parent was superior to the maternal one in quality. A frequency distribution histogram based on the F 1 population quality data showed transgressive segregation in the F 1 population with a continuous distribution (Figure 1), indicating that the quality indexes used here were quantitative traits. The phenotypic variation we observed is consistent with the previous reports (Julier et al., 2000;Wang et al., 2016;Biazzi et al., 2017). The variation over years suggested that these traits are affected by the environment.

Analysis of Correlation Coefficiency and the Broad-Sense Heritability in F 1 Population
The correlation coefficient was calculated using the BLUP value. As shown in Table 1, the three indexes for fiber composition were positively correlated and the correlation coefficient ranged from 0.40-0.68 (P < 0.01) with the highest between NDF and ADF. Crude protein was negatively correlated with both ADF (−0.49) and Lignin (−0.17) for P < 0.01, but not with NDF (P > 0.05). The calculation of the broad-sense heritability showed a relatively higher value for NDF (0.56) and lower one for lignin (0.46) ( Table 1).

Mappping of QTLs Associated With Crude Protein and Fiber Composition
Based on the data of crude protein and fiber composition of the F 1 population and its parental plants collected in 2016, 2019, and 2020, QTL mapping was performed using the alfalfa linkage map we previously constructed . In total, we detected 83 QTLs related to either protein content or fiber composition with 31, 15, and 17 loci from the year of 2016, 2019, and 2020, respectively, and the rest were detected using BLUP values. For crude protein, a total of 23 QTLs were mapped including seven QTLs identified using BLUP values ( Table 2 and  Supplementary Table 2). The LOD values ranged from 2.62 to 10.24, the percentage of phenotypic variation explained by QTL (PVE) was 1. .34%, and the additive effect value of −0.87 to 1.32. QTL qCP7D, had the highest LOD value (Supplementary Table 2), and QTL located on the chromosome 5B (qCP5B) had the highest PVE and additive effect values ( Table 2).
QTLs associated with fiber composition were separately identified based on our evaluation of NDF, ADF, and lignin content. For NDF-related QTLs, the 29 loci with 12 mapped in the paternal and 17 in the maternal parent were located on 18 linkage groups with PVE varying from 2.18-6.27% (Table 2 and  Supplementary Table 2). The number of QTLs we discovered varied in the test years with a double amount mapped in 2016 compared with the 2019 and 2020. Additive effect of half the loci (14 out of 29) was positive and the other half (15 out of 29) negative, suggesting they were inherited from the maternal and paternal parent, respectively.
For QTLs associated with ADF, in total, 18 QTLs locating on 12 different linkage groups were discovered to be associated with ADF ( Table 2 and Supplementary Table 2). In 2016, 8 QTLs were mapped, and additive effect of five loci was negative with PVE of 4.06 and 4.12%. Using, BLUP values of ADF, four QTLs were detected with three from the maternal parent.
For lignin-associated QTLs, a total of 13 QTLs were discovered on 10 linkage groups (

QTLs Associated With Multiple Traits
Our results showed that 83 QTLs were detected to be associated with the content of protein or fiber in alfalfa. Among them, 12 QTLs were found to be associated with more than one trait, and seven loci were distributed on the maternal linkage map and five on the paternal one (Figure 2 and Supplementary  Figure 1). QTL qlignin6D-2, which was associated with both lignin and crude protein, has the biggest LOD and PVE ( Table 2), suggesting this QTL may contain key genes that can significantly affect lignin content.
83.33% (10 out of 12) of the QTLs were co-localized by ADF and NDF (Figure 2 and Supplementary Figure 1), indicating that both indexes contribute to fiber connect in alfalfa. Among them, three loci were identified to be associated with ADF, NDF, and lignin simultaneously, suggesting the higher reliability of these loci as candidates for further investigation of alfalfa fiber composition. The rest two co-localized QTLs were related to crude protein and NDF (Figure 2 and Supplementary Figure 1). Interestingly, the values of addition effect for individual QTL co-localized by fiber composition were unanimously negative or positive, suggesting that thy shared the same genetic direction. In contrast, the additive effect for QTLs associated with both crude protein and fiber composition were opposite, suggesting that the loci may antagonistically contribute to the two traits.

Analysis of Additive QTLs Interacted With the Environment
We also analyzed effect of environment by measuring the interaction effects. Among the 83 quality related QTLs, 47 were identified as interacting with the environment ( Table 2 and Supplementary Table 2). The additive QTL × environment interaction effect (AE) of 24 loci was positive (0.05 ∼ 0.49), suggesting an increase of the phenotypic value of the corresponding trait, and the rest 23 were negative (−0.02 ∼ −0.55). PVE of AE was ranged from QTLs were bold to indicates that this QTL co-located with others. *QTL means it had been identified previously, qCP5A-2 (Biazzi et al., 2017); qNDF8B and qADF8B (Sakiroglu and Brummer, 2017). A, additive effects; PVE(A), percentage of phenotypic variation explained by the QTL at the current position; AE, the additive QTL × environment interaction under current year; PVE(AE), percentage of phenotype variance explained by additive QTL × environment.
Frontiers in Plant Science | www.frontiersin.org 0.10 ∼ 3.84%, suggesting these interaction effects have less effect on the phenotype.

Analysis of Epistatic QTLs
To explore the interaction between QTLs, we analyzed epistasis effect. In total, 73 epistatic QTL pairs (epQTLs) were detected with PVE ranging from 1.08 to 14.06%, and 31 pairs were identified using the BLUP values. No additive QTLs were found to be affected by other QTLs. 73 epQTLs all independently affected the phenotype, and these loci were distributed on 29 linkage groups except 2B, 3B, and 5A (Supplementary Tables 3,  4, Figure 3, and Supplementary Figure 2). Eleven pairs were associated with crude protein, eight with NDF, 17 with ADF, and 37 with lignin, respectively. A total of 25 epQTLs showed positive effects (0.12 ∼ 1.62), indicated that they could increase the phenotypic value independently from additive QTL, and 37 pairs showed negative effect (−0.11 ∼ −3.06), showed that they could reduce the phenotypic value independently. ep-QTL (eqNDF1C-eqNDF5B) associated with NDF had the highest PVE of 11.18%. In addition, 43 pairs of epistatic QTLs were identified at similar positions on the same chromosome, suggesting that most qualityrelated gene regulatory factors exist nearby. Moreover, we also identified the co-localized regions of two epistatic QTLs for different traits, which showed these loci may contribute to the two traits. The co-localized regions: (1) The region between markers TP6403 and TP71156 and the region between markers TP71156 and TP57346 on chromosome 1 (linkage group 1C) were associated with crude protein and lignin.

Analysis of Potential Candidate Genes
To search candidate genes, sequences of the markers flanking our additive QTLs were used to perform BLAST against the Medicago truncatula genome. Because of the genetic difference between Medicago truncatula and Medicago sativa, limited markers were matched. Among them, the region of qNDF1C, which co-localized with qADF1C and qlignin1C, contains eight related genes, five annotated as cellulose synthase-like protein A1 (Table 3).
Consistently, a NDF-associated SNP was detected on this locus by Li et al. (2011). Genetic manipulation of the candidate genes would help to elucidate their role in alfalfa quality formation. We also referenced the alfalfa genome (Chen et al., 2020) using TBtools to do BLAST. Six QTLs were matched on six chromosomal regions and the corresponding genes within these regions were listed (Supplementary Tables 5-10). The narrowest region (qCP4C-1) encoded seven genes, and two were predicted to be E3 ubiquitin-protein ligase SDIR1 and E3 ubiquitin-protein ligase RDUF2 (Table 3), which have been proven to be closely related to the degradation of plant proteins (Holdsworth et al., 2020). More effort are needed to align the QTLs in future.

DISCUSSION
Alfalfa (Medicago sativa L.) known as the "queen of forage" has been cultivated worldwide due to its high nutritional quality as animal feed, and become the fourth most valuable field crop in the United States. Alfalfa quality determined mainly by protein content and fiber composition affects the digestibility and nutrition intake for animal feeding. Improving alfalfa quality by increasing crude protein or reducing fiber content has benefited animal husbandry (Koçer et al., 2018). Using QTL mapping combined with high throughput genotyping technology, we identified 83 regions associated with crude protein, fiber or lignin based on the phenotypic data of an F 1 population established by crossing two alfalfa parents with significant difference in quality. It has been documented that analysis of multiple effects including additive effect, epistatic effect, and QTL × environment interaction effect helps to avoid an underestimation of the total genetic impact of a trait (Patil et al., 2013;. We found that 60.26% (47 out of 83) of additive QTLs interacted with the environment, and 73 pairs regions had epistatic effect, which suggested the complexity of alfalfa quality traits. This phenomenon was also found in the QTL identification of rice grain protein content . Future research of different mapping groups with phenotypic data from multiple temporal and spatial collection would be effective.
Previous study revealed that the QTLs of NDF, ADF, and lignin in alfalfa have co-localizations on chromosomes 1 and 3 . We identified 12 QTLs associated with multiple traits and they distribute on chromosomes 1, 4, 5, 6, 7, and 8. The poor continuity of QTL mapping has been reported in crop yield studies of rice (Agrama et al., 2007;Zhang et al., 2014), probably due to the difference of populations used for the studies, as well as the number of markers available. Colocalization and pleiotropic associations have been reported to help reveal important genomic regions or genes associated with the target traits (Huang et al., 2015). Although the 12 co-localized regions scattered on six linage groups, the traits for fiber composition, including ADF, NDF, and lignin, were found unanimously sharing either negative or positive values of addition effect, indicating the similar genetic contribution of the loci to these traits. In contrast, the two QTLs co-localized by crude protein and NDF/ADF had the opposite additive effect. For example, the additive effect of a QTL associated with crude protein, NDF, and ADF in linkage group 6C, was 0.3391, −0.8079, and −0.6446, respectively, implying that the QTL antagonistically contributes to protein content and fiber composition. The findings are consistent with the correlation analysis between these traits. Supportively, down-regulation of lignin synthesis gene hydroxycinnamoyl CoA: shikimate hydroxycinnamoyl transferase in transgenic alfalfa could reduce NDF and ADF (Shadle et al., 2007). The colocalized QTLs may be applied in improvement of multiple traits simultaneously.
Four of the 83 QTLs we identified here have been documented to be associated with alfalfa quality (Li et al., 2011;Espinoza and Julier, 2013;Biazzi et al., 2017;Sakiroglu and Brummer, 2017). For example, A SNP associated with NDF in Li's report shared the same region with qNDF1C, and the QTL was co-localized by qADF1C-2 and qlignin1C-2. The shortest region of a QTL related to protein content has two E3 ubiquitin-protein ligases, which mediate the polyubiquitination of lysine and cysteine residues on target proteins, and have been proven to play an important role as regulators of protein trafficking and degradation. These findings are in supportive of our association analysis of alfalfa quality QTLs, and the loci are worthy of further investigation. Narrow down of the regions covered by our QTLs would specify more candidates potentially contributing to alfalfa quality formation. An increase of sequencing coverage, enrichment of markers and application of higher density linkage maps would facilitate marker-assisted selection for favorable alleles in breeding alfalfa varieties with improved quality.

CONCLUSION
The identified QTLs associated with quality-related traits provide important information for understanding the genetic controls of alfalfa quality. The results of this study could be used for molecular marker-assisted selection, dramatically improving the quality of alfalfa by molecular means.