Genotypic Variation in a Breeding Population of Yellow Sweet Clover (Melilotus officinalis)

Yellow sweet clover is a widely spread legume species that has potential to be used as a forage crop in Western China. However, limited information is available on the genetic variation for herbage yield, key morphological traits, and coumarin content. In this study, 40 half sib (HS) families of M. officinalis were evaluated for genotypic variation and phenotypic and genotypic correlation for the traits: LS (leaf to stem ratio), SV (spring vigor), LA (leaf area), PH (plant height), DW (herbage dry weight), SD (stem diameter), SN (stem number), Cou (coumarin content), SY (seed yield), across two locations, Yuzhong and Linze, in Western China. There was significant (P < 0.05) genotypic variation among the HS families for all traits. There was also significant (P < 0.05) genotype-by-environment interaction for the traits DW, PH, SD, SN, and SV. The estimates of HS family mean repeatability across two locations ranged from 0.32 for SN to 0.89 for LA. Pattern analysis generated four HS family groups where group 3 consisted of families with above average expression for DW and below average expression for Cou. The breeding population developed by polycrossing the selected HS families within group 3 will provide a significant breeding pool for M. officinalis cultivar development in China.


INTRODUCTION
Yellow sweet clover, also known as field melilot or yellow melilot, is an annual or biennial herb that belongs to the Fabaceae family. It is native to temperate and tropical Asia, and Europe (GRIN, 2000). Melilotus officinalis is one of the most common species in the Melilotus genus. This species is adapted to environmental constraints such as drought and cold (Turkington et al., 1978) and salinity (Sherif, 2009). Melilotus is used as a ground cover in depleted soils (Allen and Allen, 1981), especially in moderately saline areas where traditional forage legumes cannot be grown (Maddaloni, 1986). Melilotus officinalis usually occurs in the northern region of China, where it is used as green manure for soil fertility improvement and also as a medicinal plant.
Species of Melilotus, including yellow sweet clover, have not been widely used in forage production due to their high coumarin content. Coumarin, a secondary plant metabolite, is associated with dicoumarol production. Dicoumarol is an anticoagulant that can cause a haemorrhagic condition known as sweet clover disease (Evans and Kearney, 2003; Nair et al., 2010). Therefore, the success of forage cultivar development based on any of the Melilotus species will depend on a combination of increasing dry matter production and decreasing coumarin content. A number of cultivars of Melilotus have been released to date: Acuma, Cumino, Denta, and Polara (Smith and Gorz, 1965; Goplen, 1971) for M. albus, and Norgold (Goplen, 1981), N28, and N29 (Gorz et al., 1992) for M. officinalis. The Melilotus breeding program at Lanzhou University is specifically focused on the development of new cultivars with adaptation to the vast temperate grazing environments of China.
In any plant breeding program, the rate of genetic gain depends on the genetic diversity for a given trait in the breeding population (Hallauer and Miranda, 1981). Information on the magnitude of genetic variation for key plant attributes in breeding programs will enhance the development of appropriate breeding strategies to achieve maximum genetic gain (Moll and Stuber, 1974). Jahufer and Casler (2015) evaluated the relative merit in genetic gain using single trait selection, correlated response to selection, and index selection, based on estimated genetic variation for a range of morphological and quality traits in switchgrass (Panicum virgatum L.). Genetic variation for key traits has been reported for some of the important forage grasses and legumes: ryegrass (Breese and Hayward, 1972), tall fescue (Piano et al., 2007), white clover (Jahufer et al., 2002), and alfalfa (Riday and Brummer, 2007).
There is a lack of quantitative genetic information for Melilotus. Few studies have been carried out on the genetic variation for agronomic traits in Melilotus species (Ivanov and Chetvertnykh, 1980; Sagalbekov, 1980). Nair et al. (2010) reported genotypic variation for coumarin content among 149 accessions of 15 Melilotus species. This study demonstrated the presence of potential genetic variation for coumarin content in Melilotus germplasm useful for breeding. However, breeding Melilotus species as a forage crop needs to focus not only on coumarin content but also on biomass and associated traits. There is also a lack of information on the magnitude of genotype-by-environment interaction effects in Melilotus, which will be important for breeding for broad adaptation (Cooper et al., 1993b).
The objective of our study was to conduct a preliminary assessment of the performance of half sib (HS) families of Melilotus officinalis across two contrasting locations to: (a) estimate genotypic variation for key traits, and (b) identify families with a combination of superior agronomic performance and low coumarin expression in comparison to two commercial controls.

Plant Material
Six germplasm accessions (PI 552553, PI 552554, PI 595394, PI 634019, Ames 22891, and Ames 25658) were selected from a set of 51 accessions that were evaluated for biomass production, agronomic performance, and low coumarin content in Yuzhong, Gansu Province, during 2012-2013 (results not presented). Elite genotypes representing each of the germplasm accessions were polycrossed in isolation, using honey bees, to generate a breeding population to be used for cultivar development. A total of 40 HS families were generated by harvesting each of the genotypes individually. All harvested seeds from the individual genotypes were kept separately as individual HS families.

Field Trials
The M. officinalis HS families were established at two locations: Yuzhong (104°09′ E, 35°89′ N, elevation 1,653 m a.s.l.) and Linze (100°02′ E, 39°15′ N, elevation 1,390 m a.s.l.) in Gansu Province, China. Climatic conditions differ between Yuzhong and Linze. Yuzhong, in the Loess Plateau region, has a medium temperate semi-arid climate, whereas Linze, in the Hexi Corridor, has a typical arid desert climate (Su et al., 2007; Hu et al., 2012; Li et al., 2014). The average annual precipitation is 295 mm in Yuzhong and only 117 mm in Linze. The mean monthly minimum and maximum temperatures, and total monthly rainfall during the trial period at the two locations are shown in Figure 1.
The soil type is loessal soil at Yuzhong and meadow soil at Linze. The saline-alkali degree was much higher in Linze than in Yuzhong: the salinity is 1.8 ppt in Linze and 0.5 ppt in Yuzhong. Initial soil conditions in Yuzhong and Linze, respectively, are: pH 7.0 and 7.5, total N of 0.756 and 0.803 g/kg, and total P of 0.752 and 0.708 g/kg.
At each location, the experimental plots were arranged in a randomized complete block design containing three replicates. Each replicate consisted of the 40 HS families, the six parental germplasm accessions, and two commercial checks. The origins of these entries are provided in Table 1. The two trials were sown on 15-18 June 2014. The experimental plot size for each entry was 2.4 m² (0.8 × 3 m). Within each plot, the seed was planted at a spacing of 30 cm within rows and 60 cm between rows. The plots were fertilized with 150 kg (NH₄)₂HPO₄ ha⁻¹ after sowing.
Visual scoring for SV was based on a scale of 1 to 5 (1 = low; 5 = high). The morphological traits (PH, SN, SD, and LA) were measured at the flowering stage (when 50% of the plants had open flowers) on a minimum of three individuals per replicate. LA was measured from three middle leaflets per plant by using a flatbed scanner (EPSON GT-15000) and a WinSEEDLE 2011 image analysis system (Regent Instruments Inc.). Individual plants were harvested for DW measurement at the flowering stage after the morphological traits had been measured. At harvest, three randomly sampled plants from each replicate were cut off at 3 cm above the soil, placed in paper bags, and dried at room temperature (about 20-25°C) with good ventilation until no change in weight was recorded. After measuring DW, the dried samples were hand separated into leaf blade and stem (including the inflorescence and leaf sheath) components and weighed to determine the LS ratio. Three subsamples from each field replicate at Yuzhong were combined and ground in a mill to pass through a 1 mm screen for Cou determination. SY was determined from two randomly sampled individuals taken from each replicate at the Linze field when 90% of the pods had turned blackish brown. Cou was quantified using HPLC (high performance liquid chromatography, Agilent 1100 series) with a mobile phase of methanol-water (65:35) through an Agilent-XDB C18 column (Zhu and Fan, 2008).
FIGURE 1 | Mean monthly maximum and minimum temperatures (°C) and total monthly rainfall (mm) at Yuzhong (A) and Linze (B), respectively.

Analysis of Variance
The data were analyzed within and across the two locations Yuzhong and Linze. The analysis across locations was conducted: (a) on only the 40 HS families to estimate genotypic variation, and (b) using all entries in the trial that consisted of the 40 HS families, the six parental germplasm accessions and two check cultivars, which enabled comparison of progeny, parents, and the commercial material. The analysis was conducted using the variance component analysis procedure, Residual Maximum Likelihood (REML) option, in GenStat 7.1 (2003). A mixed linear model was used for the analyses across the two locations using the REML algorithm.
The linear model used in the analysis was:

$$Y_{ijk} = M + g_i + l_j + r_{jk} + (gl)_{ij} + \varepsilon_{ijk} \qquad (1)$$

where $Y_{ijk}$ is the value of an attribute measured from HS family $i$ in replicate $k$ in location $j$, with $i = 1,\ldots,n_g$, $j = 1,\ldots,n_l$, and $k = 1,\ldots,n_r$; $M$ is the overall mean; $g_i$ is the random genotypic effect of HS family $i$, $N(0,\sigma^2_g)$; $l_j$ is the fixed effect of location $j$, $N(0,\sigma^2_l)$; $r_{jk}$ is the random effect of replicate $k$ within location $j$, $N(0,\sigma^2_b)$; $(gl)_{ij}$ is the effect of the interaction between HS family $i$ and location $j$, $N(0,\sigma^2_{gl})$; and $\varepsilon_{ijk}$ is the residual effect for HS family $i$ in replicate $k$ in location $j$, $N(0,\sigma^2_\varepsilon)$. The mixed model analysis generated HS family means based on Best Linear Unbiased Predictors (BLUP) (White and Hodge, 1989). These BLUP values were used to construct an HS family × trait mean matrix adjusted for HS family × location interaction effects.
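For a complete, balanced trial, the variance components of the model above can also be approximated by equating ANOVA mean squares to their expectations. The Python sketch below illustrates that method-of-moments approach. It is not the REML procedure run in GenStat, it assumes balanced data with no missing plots, and the function name and data layout are hypothetical.

```python
from statistics import mean


def variance_components(y):
    """Method-of-moments variance components for balanced data.

    y[i][j][k] is the value for family i, location j, replicate k.
    Illustrative stand-in for REML; assumes no missing values.
    """
    n_g, n_l, n_r = len(y), len(y[0]), len(y[0][0])
    grand = mean(v for fam in y for loc in fam for v in loc)
    fam_mean = [mean(v for loc in fam for v in loc) for fam in y]
    loc_mean = [mean(y[i][j][k] for i in range(n_g) for k in range(n_r))
                for j in range(n_l)]
    cell_mean = [[mean(y[i][j]) for j in range(n_l)] for i in range(n_g)]
    rep_mean = [[mean(y[i][j][k] for i in range(n_g)) for k in range(n_r)]
                for j in range(n_l)]

    # Sums of squares for families, family x location, and residual.
    ss_g = n_l * n_r * sum((m - grand) ** 2 for m in fam_mean)
    ss_gl = n_r * sum(
        (cell_mean[i][j] - fam_mean[i] - loc_mean[j] + grand) ** 2
        for i in range(n_g) for j in range(n_l))
    ss_e = sum(
        (y[i][j][k] - cell_mean[i][j] - rep_mean[j][k] + loc_mean[j]) ** 2
        for i in range(n_g) for j in range(n_l) for k in range(n_r))

    ms_g = ss_g / (n_g - 1)
    ms_gl = ss_gl / ((n_g - 1) * (n_l - 1))
    ms_e = ss_e / (n_l * (n_g - 1) * (n_r - 1))

    # Solve the expected mean squares for the variance components.
    return {
        "var_g": (ms_g - ms_gl) / (n_l * n_r),
        "var_gl": (ms_gl - ms_e) / n_r,
        "var_e": ms_e,
    }
```

Negative estimates can arise with this approach and are usually set to zero. REML avoids that problem and also handles unbalanced data.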

Genotypic Variation and Repeatability
Variation among HS families generated from a population that has gone through at least two cycles of random mating is an estimate of ¼ of the additive variation of the random mating population they represent (Falconer, 1989). In our study, the 40 HS families were a result of the first random mating of selected germplasm and therefore represented only the F1 generation. Therefore, we do not refer to the variation estimated among the 40 HS families as ¼ additive variation, but as genotypic variation, because it may reflect a combination of additive and non-additive effects. The genotypic variation for the different traits enabled calculation of repeatability, an estimate of the upper limit of their degree of genetic determination (Falconer, 1989).
The genotypic variance components generated from the REML analysis within and across locations were used to calculate repeatability (R) (Fehr, 1987).
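HS family mean repeatability at a single site:

$$R_1 = \frac{\sigma^2_g}{\sigma^2_g + \sigma^2_\varepsilon / n_r} \qquad (2)$$

HS family mean repeatability across locations:

$$R_2 = \frac{\sigma^2_g}{\sigma^2_g + \sigma^2_{gl} / n_l + \sigma^2_\varepsilon / (n_l n_r)} \qquad (3)$$

where, in both model (2) and model (3), the respective variance components and their divisors are defined in relation to linear model (1).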
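As a worked illustration, the Python sketch below computes family mean repeatability from variance components, assuming the standard entry-mean form in which the residual variance is divided by the number of replicates and the interaction variance by the number of locations. The function names and the numeric inputs are hypothetical and are not estimates from this study.

```python
def repeatability_single_site(var_g, var_e, n_r):
    """Family mean repeatability at one site (assumed entry-mean form)."""
    return var_g / (var_g + var_e / n_r)


def repeatability_across_sites(var_g, var_gl, var_e, n_l, n_r):
    """Family mean repeatability across n_l locations with n_r replicates each."""
    return var_g / (var_g + var_gl / n_l + var_e / (n_l * n_r))


# Hypothetical variance components for a design with 2 locations x 3 replicates.
r1 = repeatability_single_site(var_g=2.0, var_e=3.0, n_r=3)
r2 = repeatability_across_sites(var_g=2.0, var_gl=1.0, var_e=3.0, n_l=2, n_r=3)
```

In the across-location form, a large family-by-location variance lowers repeatability even when the residual variance is small, which is why traits with strong interaction tend to show lower across-site values.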

Phenotypic and Genotypic Correlation
Phenotypic correlation ($r_p$) analysis was carried out using GenStat 7.1 (2003). The multivariate MANOVA procedure within GenStat 7.1 (2003) enabled estimation of sums of cross products, using the multisite trait data from the 40 HS families. Mean cross products were then calculated and resolved to estimate genotypic covariance components. The genotypic covariance components were used together with the $\sigma^2_g$ estimates from the REML analysis to determine genotypic correlation coefficients ($r_g$) according to Falconer (1989).
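The calculation described above can be sketched in Python as follows. The genotypic correlation is the genotypic covariance of two traits divided by the square root of the product of their genotypic variances. The covariance helper assumes that mean cross products have the same expectation structure as mean squares in a balanced family × location design; that assumption, the function names, and the numbers are illustrative only.

```python
import math


def genotypic_covariance(mcp_g, mcp_gl, n_l, n_r):
    """Resolve family and family x location mean cross products into a
    genotypic covariance (balanced design assumed)."""
    return (mcp_g - mcp_gl) / (n_l * n_r)


def genotypic_correlation(cov_g_xy, var_g_x, var_g_y):
    """r_g = cov_g(x, y) / sqrt(var_g(x) * var_g(y))."""
    return cov_g_xy / math.sqrt(var_g_x * var_g_y)


# Hypothetical inputs for two traits measured at 2 locations with 3 replicates.
cov_g = genotypic_covariance(mcp_g=4.6, mcp_gl=1.0, n_l=2, n_r=3)
r_g = genotypic_correlation(cov_g, var_g_x=1.0, var_g_y=1.44)
```

Because the variance and covariance components are estimated separately, an estimate of r_g can fall outside the range -1 to 1 when the genotypic variances are small or poorly estimated.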

Pattern Analysis
Pattern analysis was conducted to: (a) provide a graphical summary of the performance of the 40 HS families, six parental germplasm accessions and the two check cultivars of M. officinalis, based on the genotype × trait BLUP adjusted mean matrix generated from variance component analysis across the two locations Yuzhong and Linze, and (b) investigate any changes in type (positive or negative) and magnitude of the association among the seven traits across Yuzhong and Linze. Pattern analysis consisted of a combination of cluster and principal component analysis (PCA) (Gabriel, 1971;Kroonenberg, 1994;Watson et al., 1995). To identify the optimum level of truncation for the resulting hierarchy from cluster analysis, the increase in the sum of squares among accession groups was monitored as the number of groups increased. The group level selected was determined by the point where the percentage of accession sum of squares among groups did not improve substantially as the number of groups increased (DeLacy, 1981).
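The truncation rule described above can be illustrated with a short Python sketch that computes the percentage of the total sum of squares retained among groups for a given grouping. Evaluating it for successive cuts of the hierarchy shows where adding further groups stops improving the figure substantially. The function name and toy data are hypothetical, and the sketch does not reproduce the clustering itself.

```python
def percent_ss_among_groups(points, labels):
    """Percentage of the total sum of squares that lies among groups.

    points: list of equal-length numeric tuples (one per entry).
    labels: group label for each entry, in the same order.
    """
    n_dim = len(points[0])
    grand = [sum(p[d] for p in points) / len(points) for d in range(n_dim)]
    total_ss = sum((p[d] - grand[d]) ** 2
                   for p in points for d in range(n_dim))

    among_ss = 0.0
    for group in set(labels):
        members = [p for p, lab in zip(points, labels) if lab == group]
        centroid = [sum(p[d] for p in members) / len(members)
                    for d in range(n_dim)]
        # Each group contributes its size times its squared centroid distance.
        among_ss += len(members) * sum(
            (centroid[d] - grand[d]) ** 2 for d in range(n_dim))

    return 100.0 * among_ss / total_ss


# Toy example: four entries scored on two standardized traits.
toy_points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
two_groups = percent_ss_among_groups(toy_points, [0, 0, 1, 1])
```

In the toy data, two groups already retain about 96% of the total sum of squares, so splitting further would add little, which is the pattern the truncation criterion looks for.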

Genotypic Variance Components and HS Family Mean Repeatability of Plant Attributes of M. officinalis
The genotypic variance estimated for the different traits from the individual location analyses, Yuzhong and Linze, indicated significant (P < 0.05) variation among the 40 M. officinalis HS families (Tables 2A,B). At both locations, HS family mean repeatability estimates ranged from intermediate to very high, depending on the trait. At Yuzhong, the HS family mean repeatability ($R_1$) was high for the traits DW, SD, and SV, ranging from 0.89 to 0.96.
Analysis of variance for mean trait expression across the two sites, Yuzhong and Linze, indicated significant (P < 0.05) genotypic variation among the 40 HS families. There was also significant (P < 0.05) genotype-by-location interaction, depending on the trait (Table 3). There was no significant (P > 0.05) genotype-by-location interaction for the traits LS and LA. Line mean repeatability ($R_2$) across the two locations varied from relatively high for the traits DW and LA, to intermediate for PH, LS, SD, and SV, to low for SN.

Pattern Analysis: Principal Component Analysis (PCA)
The biplot (Figure 2) was generated from PCA of the 40 HS families, the six parental germplasm accessions, and the two check cultivars of M. officinalis, based on the 9 traits LS, SV, LA, PH, DW, SD, SN, and Cou. The first principal component explained 46% of the total trait variation, and the second principal component explained 18%. The correlation structure of the traits is indicated by the directional vectors in the biplot. In this study, SD, SN, and PH showed a strong positive association with DW. The traits LS and Cou showed a negative correlation with DW. The seven plant trait responses at the locations Yuzhong and Linze are presented in two biplots (Figures 3A,B). In Figure 3A, based on breeding line performance at Yuzhong, the first and second principal components accounted for 43 and 19% of the total variation, respectively. Based on the line performance at Linze, the first principal component explained 51% of the total trait variation, and the second principal component explained 23% (Figure 3B).
FIGURE 2 | Biplot generated using standardized Best Linear Unbiased Predictor values for eight traits measured from the 40 half sib families, the 6 parental germplasm accessions, and the 2 check cultivars of M. officinalis, evaluated across two locations, Yuzhong and Linze. Components I and II account for 46 and 18% of total variation, respectively. The different symbols indicate progeny Groups 1 to 4 generated from cluster analysis. The vectors represent the traits: LS, leaf to stem ratio; LA, leaf area; PH, plant height; DW, herbage dry weight; SD, stem diameter; SN, stem number; SY, seed yield; Cou, coumarin. The 6 parental germplasm accessions: P1 to P6. Checks: CH1, experimental cultivar; CH2, cv Norgold.
There were differences in trait association across the two locations, Yuzhong and Linze. The traits DW, SD, PH, SN, and LA showed a strong positive correlation at Yuzhong (angles between the directional vectors are <45°). At Linze, DW was positively correlated with SD and PH, as at Yuzhong. However, SN and LA showed only a weak positive association with DW (Figures 3A,B).
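The reading of vector angles used above can be made concrete with a small Python sketch. In a PCA biplot, the cosine of the angle between two trait vectors approximates the correlation between those traits, to the extent that the two plotted components capture the variation. The function names and the example coordinates are hypothetical and are not taken from Figures 2 or 3.

```python
import math


def vector_angle_degrees(u, v):
    """Angle in degrees between two 2-D biplot vectors."""
    dot = u[0] * v[0] + u[1] * v[1]
    norm = math.hypot(u[0], u[1]) * math.hypot(v[0], v[1])
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cosine))


def implied_correlation(u, v):
    """Approximate correlation implied by the angle between two vectors."""
    return math.cos(math.radians(vector_angle_degrees(u, v)))


# Hypothetical loadings for two traits on components I and II.
angle = vector_angle_degrees((1.0, 0.0), (1.0, 1.0))
```

An angle below 45° corresponds to an implied correlation above about 0.71, an angle near 90° to little or no association, and an angle near 180° to a strong negative association.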

Cluster Analysis
Clustering of the 40 HS families, together with the 6 parental germplasm accessions and 2 check cultivars, was truncated at the four-group level. Group 4, the largest group, contained 17 members, followed by group 1, group 3, and group 2, which contained 14, 11, and 6 members, respectively (Table 4). As indicated in Figure 2, the check cultivars were both in group 1. The parental germplasm accessions P1, P2, and P3 were in group 3, and P4, P5, and P6 were in group 1. The trait means for each group (Table 4) indicated that the members in group 3 had high DW and low coumarin content, and those in group 1 had low coumarin content and intermediate expression for the traits DW, PH, SD, and SN. The members in group 4 showed characteristics of a small plant type with high coumarin content. The highest expression for coumarin was in group 2. Groups 3 and 1 had higher SY expression in comparison to groups 4 and 2.

Phenotypic and Genotypic Correlation
A range of genotypic and phenotypic correlation coefficients is presented in Table 5. These coefficients range from strong to weak positive and negative pairwise associations among the 7 traits. Of special interest are the phenotypic and genotypic correlations between DW and the other traits. There was strong positive phenotypic correlation between DW and the traits SD, PH, and SN, and strong negative phenotypic correlation with LS and SV. These results are further supported by the directional vectors in the biplots (Figures 2, 3A,B). In comparison to phenotypic correlation, the estimated genotypic correlation coefficients for all 7 traits showed similar types of pairwise association (Table 5).

DISCUSSION
Previous studies on genotypic variation within Melilotus spp. have mainly focused on interspecific comparisons for traits such as coumarin content (Nair et al., 2010), salinity, and waterlogging tolerance (Rogers et al., 2008), and also on phylogenetic relationships (Di et al., 2015) and genetic diversity (Wu et al., 2016). The significant (P < 0.05) genotypic variation and high to moderate line mean repeatability reported from our study indicate the potential for genetic improvement of the nine traits examined. There are no reported studies in M. officinalis similar to ours that estimate the magnitude of genotypic variation for key traits such as DW, Cou, PH, and SY. Phenotypic variation, expressed as ranges, has been reported for some morphological traits. Klebesadel (1992) reported 2 year means of PH of M. officinalis ecotypes ranging from 112 to 145 cm. Second year mean plant height (PH) measured in our study ranged from 144 to 188 cm. Martino et al. (2006) reported a range of coumarin content between 0.12 and 0.39% based on different extraction methods. Nair et al. (2010) reported coumarin content measured from 27 M. officinalis accessions ranging from 0.09 to 0.61% of dry matter. Our study indicated a coumarin content that ranged from 0.04 to 0.91% of dry matter. Herbage dry matter from single plants has been reported from experiments conducted under glasshouse conditions (Rogers et al., 2008). There is a lack of information on morphological traits measured under field conditions. Results from our study on the genotypic variation for the traits LS, SD, SN, LA, SV, and SY, measured under field conditions, will be valuable to Melilotus breeders. Information on the magnitude and significance of the genotypic and environmental components of phenotypic variation for important traits will provide a basis for the development of efficient breeding methods for their improvement (Moll and Stuber, 1974).
Results from the present study showed that there was significant genotypic variation among the 40 HS families at each location, Yuzhong and Linze, and also across these two locations for all the traits measured. High genotypic variation was present for DW, SV, and SD at Yuzhong and for LA, SD, DW, and SV at Linze. These results, together with the relatively high HS family mean repeatabilities estimated, indicate the potential genetic variation available within the new M. officinalis breeding population for improvement of these traits through selection and breeding.
Forage plants are utilized across a wide range of environments, which include different climates, soil types, and grazing systems (Breese, 1969). The presence of genotype-by-environment interactions complicates selection of material for broad adaptation due to unreliable performance across environments (Comstock and Moll, 1963; Cooper and Byth, 1996). Quantifying the magnitude and understanding the causes of genotype-by-environment interaction can be helpful when planning breeding strategies (Milligan et al., 1990; Basford and Cooper, 1998). Caradus (1993) reported that a range of traits in white clover, especially yield-related traits, were sensitive to genotype-by-environment interactions. A similar result in white clover was reported by Jahufer et al. (1999). In our study, the genotype-by-environment interactions were significant for all traits except LS and LA. This indicates the importance of multi-site evaluation in M. officinalis breeding programs when focusing on broad adaptation. The application of multisite testing in breeding programs to investigate the effect of genotype-by-environment interaction on line performance has been reported for forage grass and legume species such as perennial ryegrass (Easton et al., 2015), switchgrass (Jahufer and Casler, 2015), alfalfa (Hill and Baylor, 1983), and white clover (Ballizany et al., 2012).
The association among the traits measured in our study was examined using a combination of phenotypic and genotypic correlation with pattern analysis. The estimates of phenotypic and genotypic correlation coefficients supported the association among traits indicated in the biplots. The positive and significant phenotypic association of DW with the traits PH, SD, and SN predicts a positive correlated response in all these traits when any one of them is selected for individually. This relationship will be useful in a breeding program. The strong positive correlation between DW and SY shown in the biplot (Figure 2) indicates that selection for herbage yield would also result in increasing seed yield. Significant correlation of forage yield and seed yield has also been demonstrated in other legumes (Iannucci and Martiniello, 1998; Guler et al., 2001; Cakmakci et al., 2006). Our study indicated negative phenotypic and genotypic correlation between DW and LS. The LS ratio is used as an indicator of digestibility and intake in forage (Kephart et al., 1990). This result implies a tradeoff between herbage yield and quality. Julier et al. (2000) also estimated significant negative correlation between DW and LS in alfalfa, which is similar to M. officinalis in vegetative form (Whitson et al., 1992).
The strong negative relationship between SV and DW suggests that measurement of spring vigor, at a very early stage of plant growth, could serve as an indirect selection criterion for increasing herbage yield for M. officinalis grown in western China (Table 5). This will increase the efficiency of current breeding methods, especially when dealing with a biennial forage species like M. officinalis. Similar results were reported from studies on common vetch (Cakmakci et al., 2006). The negative phenotypic correlation between the traits DW and Cou shown in our study (Figure 2) indicates the possibility of identifying HS families with a combination of high herbage dry weight and low coumarin content expression. This association will be of significant importance in our M. officinalis breeding program. Hofmann and Jahufer (2011) showed negative association between flavonoid accumulation and biomass using multivariate analysis.
Pattern analysis has been successfully used to summarize complex genotype-by-environment (Cooper et al., 1993a; Zhang et al., 2006) and genotype-by-trait (Jahufer et al., 1999; Davodi et al., 2011) data matrices. Jahufer et al. (1999) successfully identified superior white clover full-sib families based on seven morphological traits using a combination of principal component and cluster analysis. Davodi et al. (2011) used pattern analysis to summarize the performance of 200 alfalfa germplasm accessions, based on 12 traits, for use in the improvement of yield and quality. In our study, pattern analysis generated four groups (Figure 3), where group 3 consisted of HS families with above average performance for DW and below average performance for Cou. Group 3 consisted of 11 members, which included the parental germplasm accessions P1, P2, and P3. All the HS families in group 3 had a higher expression of the traits DW, SD, and SY in comparison to both commercial checks. The breeding lines in group 3 will be polycrossed to produce a breeding population that will be used in the recurrent selection program to develop new cultivars of M. officinalis with high herbage yield and low coumarin content for the Loess Plateau region in China.

CONCLUSION
The estimates of genotypic variation and HS family mean repeatability indicate the potential genetic variation available for all the traits examined in our study. These estimates also indicate the potential to develop cultivars with increased forage yield and low coumarin content. The significant genotype-by-environment interaction estimated for the traits DW, PH, SD, SN, and SV across the two environments, Yuzhong and Linze, indicates the importance of multi-environment evaluation trials in our M. officinalis breeding program. The breeding population developed by polycrossing the HS families within group 3, identified using pattern analysis, will provide a significant breeding pool for M. officinalis cultivar development in China.

AUTHOR CONTRIBUTIONS
KL, MJ, JZ, and YW conceived the topic. KL, FW, HD, and XM performed the experiments. KL and MJ analyzed all statistical data. KL wrote the manuscript. All authors revised the manuscript. We thank the National Plant Germplasm System (NPGS) for providing the Melilotus officinalis seeds.