Unraveling the Relationship Between Seed Yield and Yield-Related Traits in a Diversity Panel of Brassica juncea Using Multi-Traits Mixed Model

The response to selection in any crop improvement program depends on the degree of variance and heritability. The objective of the current study was to explain variance and heritability components in Indian mustard, Brassica juncea (L.) Czern & Coss, and to recognize promising genotypes for effective breeding. Two hundred and eighty-nine diverse accessions of Indian mustard belonging to four continents were analyzed for yield and yield-related traits (20 traits) over two seasons (2017–2018 and 2018–2019) using an alpha lattice design. The genetic variance was significant (P ≤ 0.01) in both the individual-season and pooled analyses for all of the evaluated traits, demonstrating the presence of significant genetic variability in the diversity panel, which offers greater opportunities for utilizing these traits in future breeding programs. High heritability combined with high genetic advance as percent of mean and high genotypic coefficient of variation was observed for flowering traits, plant height traits, seed size, and seed yield/plant; hence, a better genetic gain is expected upon the selection of these traits over subsequent generations. Both correlation and stepwise regression analysis indicated that biological yield, total seed yield, plant height up to the first primary branch, seed size, total siliqua count, days to flowering initiation, plant height at maturity, siliquae on the main shoot, and siliqua length were the most significant contributory traits for seed yield/plant. Also, promising genotypes were identified among the diversity panel, which can be utilized as donors to improve Indian mustard further. These results indicated a greater scope for improving seed yield per plant directly through the selection of genotypes having a parsimonious combination of these nine traits.


INTRODUCTION
Brassica juncea (L.) Czern & Coss (AABB) is the second most important edible oilseed crop in India after soybean. B. juncea is a natural allotetraploid of two diploid species, viz., Brassica rapa (AA) and Brassica nigra (BB). Rapeseed is a key species from the Brassica genus and a high-value crop for the oil and biofuel industries. In India, during 2018-2019, rapeseed-mustard was cultivated over an area of 5.96 million hectares with production and productivity of 8.32 million tons and 1,397 kg/ha, respectively (Directorate of Economics & Statistics, and Dac&Fw., 2019). Globally, India is the second-largest rapeseed-mustard-cultivating country after China and third in production next to China and Canada (Kumari et al., 2019). In addition to its use as edible oil, mustard oil has a spectrum of industrial uses, such as in paint and printing ink additives, greases and lubricants, resins and polymers, plastics, cosmetics, and the pharmaceutical industries (Gupta, 2016).
Genetic variability is the prime criterion for initiating any genetic improvement program. Genetic parameters aid in recognizing gene action and identifying the components of genetic variance, and also facilitate the selection of an appropriate breeding technique. Heritability is generally influenced by the genotypic and phenotypic variances and by environmental factors (Bisne et al., 2009). Therefore, information about heritability, the predictability of genetic gains, and the association between seed yield and yield-related traits in the base germplasm collection is vital for any genetic improvement program.
Yield is a complex trait and is greatly influenced by environmental factors. Hence, the selection of superior genotypes among a large set of genotypes based on their arithmetic mean performance may not be accurate (Piepho et al., 2008). In this context, the best linear unbiased prediction (BLUP) can provide good predictive accuracy compared with other procedures for estimating the random effects due to genotypes in a mixed model. It adjusts the phenotypic values for nongenetic effects and shrinks them toward the probable genetic values. This shrinkage anticipates the regression to the mean observed in selected genotypes: individuals with extremely high or low performance are adjusted toward the mean, thereby improving the accuracy of the estimated genotypic effects (Molenaar et al., 2018). Hu (2015) showed that BLUP was effective for calculating genetic parameters and predicting genotypic values and concluded that it could be applied in genetic improvement programs for rapeseed-mustard.
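The shrinkage behavior of BLUP described above can be illustrated with a minimal sketch. It assumes a balanced one-way random-effects model with known variance components, which is a deliberate simplification of the multi-year model used in this study; the function name, genotype labels, and numbers are hypothetical.

```python
from statistics import mean

def blup_one_way(plot_values, var_g, var_e):
    """BLUPs of genotype effects for a balanced one-way random model.

    plot_values: dict mapping genotype name -> list of plot observations
    var_g, var_e: assumed (known) genotypic and residual variances
    """
    n = len(next(iter(plot_values.values())))  # plots per genotype (balanced)
    grand_mean = mean(v for obs in plot_values.values() for v in obs)
    # Shrinkage factor: share of the variance of a genotype mean that is genetic
    shrink = var_g / (var_g + var_e / n)
    # Each genotype's deviation from the grand mean is pulled toward zero
    return {g: shrink * (mean(obs) - grand_mean) for g, obs in plot_values.items()}

# Hypothetical seed yield/plant values (g) for three genotypes, two plots each
data = {"G1": [30.0, 34.0], "G2": [18.0, 22.0], "G3": [8.0, 8.0]}
blups = blup_one_way(data, var_g=40.0, var_e=20.0)
```

With these assumed variances the shrinkage factor is 0.8, so the extreme genotypes G1 and G3 are moved 20% closer to the grand mean than their raw deviations would suggest.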
The yield of a crop is directly or indirectly influenced by various yield-contributing traits such as seed size, primary and secondary branches per plant, length of siliqua, seeds per siliqua, etc. Hence, plant breeders often select for such traits in combination, each of which is assigned an economic weight based on its importance toward seed yield, to form a selection index (Smith, 1936; Hazel, 1943). Multivariate analysis methods, such as genetic correlation analysis, stepwise multiple regression analysis, and path analysis, have been utilized in several crops, including mustard, to identify the causal traits having either a direct or an indirect effect on seed yield (Olivoto et al., 2016). Path coefficient analysis separates the direct and indirect effects of variables by splitting the correlation coefficients. Therefore, the contribution of each character to yield can be assessed to select appropriate traits for indirect selection in any breeding technique (Rao et al., 2013). However, studies using mixed models and sequential path analysis to identify cause-and-effect relationships based on genotypic values in B. juncea are still very scarce.
In this context, the current study was carried out with the objectives (i) to use a restricted maximum likelihood/BLUP-based method to assess variance, genetic parameters, and genotypic performance of mustard genotypes in multiyear trials, (ii) to fit a stepwise regression model for identifying highly significant traits to form a path diagram that explains the relationship of cause and effect among seed yield-related traits, and (iii) to group the germplasm lines sharing a common attribute based on Mahalanobis distance. This study unravels the nature of genetic variability in Indian mustard and would be helpful in the selection of superior genotypes for yield and related traits, which would further augment ongoing and future mustard breeding programs.

Source of Germplasm and Experimental Location
Two hundred and eighty-nine diverse accessions of B. juncea germplasm were obtained from Punjab Agricultural University, Ludhiana, under an Indian Council of Agricultural Research (ICAR)-National Agricultural Science Fund-sponsored project. The panel included varieties, cultivars, introgression lines, derived lines, and exotic and indigenous collections from diverse origins such as India, Australia, Europe, Germany, and Canada, and was evaluated for phenological and morphological traits under timely sown irrigated conditions. The details of the germplasm accessions used are presented in Supplementary Table 1.

Experimental Layout and Observation Recording
Trials were laid out in a randomized alpha lattice design with two replications; each plot consisted of four rows of 2-m length in a plot size of 2.4 m². Five representative plants from each treatment were selected from the middle two rows for evaluation of agronomic performance. Recommended agronomic practices were followed in both seasons. Data on five characters, viz. days to flowering initiation (DFI), days to 50% flowering (DFF), days to 100% flowering (DHF), days to flowering completion (DCF), and days to maturity (DMT), were recorded on a plot basis. Morphological data on 15 characters, recorded on five plants each, included plant height at flowering (PH_Fl), plant height up to the first primary branch (PH_FPB), plant height at maturity (PH_M), number of primary branches (PB), number of secondary branches (SB), main shoot length (MSL), siliquae on the main shoot (SMS), total siliquae count (TSC), siliqua length (SL), seeds per siliqua (SPS), seed size (SS), seed yield/plant (SY/Plant), total seed yield/plot (TSY/Plot), biological yield/plot (BY/Plot), and harvest index (HI).

Analysis of Variance
For each trait, plot-level averages from both seasons were taken as the response variable in an iterative mixed linear model fitting procedure starting from the full model (Eq. 1) in the lme4 R package (Bates and Maechler, 2009). The best-fit model for each agronomic trait was attained by removing all random terms from the model that were not significant at α = 0.05 in a likelihood ratio test (Littell et al., 2006). Three variance components ($\sigma^2_g$, $\sigma^2_{gy}$, and $\sigma^2_e$) for each of the 20 traits were calculated using the restricted maximum likelihood (Patterson and Thompson, 1971) estimation method. In the current study, the year was fitted as a fixed effect, and genotypes, blocks, replications, and the genotype × year interaction were fitted as random effects. The phenotypic observation $z_{ijkl}$ on accession j in block l of replication k in year i was modeled as:

$$z_{ijkl} = \mu + y_i + g_j + r_{ik} + b_{ikl} + (gy)_{ij} + \varepsilon_{ijkl} \qquad (1)$$

where $\mu$ is the grand mean; $y_i$ is the fixed effect of year i; $g_j$ is the random effect of genotype j and is ∼NID(0, $\sigma^2_g$); $r_{ik}$ is the random effect of replication k in year i and is ∼NID(0, $\sigma^2_r$); $b_{ikl}$ is the random effect of block l nested within replication k in year i and is ∼NID(0, $\sigma^2_b$); $(gy)_{ij}$ is the random effect of the interaction between genotype j and year i and is ∼NID(0, $\sigma^2_{gy}$); and $\varepsilon_{ijkl}$ is the random residual effect and is ∼NID(0, $\sigma^2_e$). Diagnostic residual plots were used to check the normality and homogeneity of the residuals. If the residuals from the fitted model did not meet the assumptions, the data were subjected to transformation. The final model was utilized to generate the BLUP for each genotype.
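The likelihood ratio test used to decide whether a random term stays in the model can be sketched as follows. This is a minimal illustration, not the lme4 workflow itself: it assumes the log-likelihoods of the full and reduced models are already available, drops a single variance component, and refers the statistic to a chi-square distribution with one degree of freedom (a reference distribution that is known to be conservative when a variance is tested on its boundary). The function name and log-likelihood values are hypothetical.

```python
import math

def lrt_single_term(loglik_full, loglik_reduced):
    """LRT statistic and chi-square (1 df) p-value for dropping one random term."""
    stat = max(0.0, 2.0 * (loglik_full - loglik_reduced))
    # Survival function of a chi-square with 1 df: P(X > stat) = erfc(sqrt(stat / 2))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical log-likelihoods with and without the genotype-by-year term
stat, p = lrt_single_term(loglik_full=-1250.3, loglik_reduced=-1254.9)
keep_term = p < 0.05  # retain the term only if it is significant at alpha = 0.05
```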

Estimation of Heritability and Genetic Parameters
Heritability across years was estimated from the analysis of variance using the ratios of Hallauer and Miranda (1988). The genetic advance was estimated for traits using the formula given by Johnson et al. (1959). The genetic advance as percentage of mean was assessed as defined by Souza et al. (2009). The phenotypic (PCV) and genotypic (GCV) coefficients of variation were calculated according to Singh and Chaudhary (2004).
Pearson's correlation coefficient (r) between BLUP values for each pair of traits was estimated using the "corrr" package (Version 0.4.3; Ruiz et al., 2019) in R version 3.5.1 (R Core Team, 2018). To identify the most influential agronomic traits, a stepwise regression model was fitted with seed yield/plant as the dependent variable. The independent variables with the highest share in explaining the variation of the dependent variable were identified using PAST version 3.09 software. The significance level for a term in the regression model was 5%. The R packages "agricolae" (de Mendiburu and de Mendiburu, 2019), "Hmisc" (Harrell and Harrell, 2019), and "diagram" (Soetaert, 2009) were utilized for path analysis.
Divergence was estimated from the predicted mean (BLUP) values of the 20 characters pooled over the years; these values, together with the residual variance-covariance matrix generated using the vcov function of the lme4 package, were subjected to grouping using the D² statistic according to Mahalanobis (1936), as extended by Rao (1952). The Mahalanobis distance matrix thus obtained was further subjected to clustering by the Ward2 hierarchical agglomerative clustering method (Murtagh and Legendre, 2014). R packages such as "biotools" (da Silva and da Silva, 2017), "dendextend" (Galili, 2015), "circlize" (Gu et al., 2014), "plotrix" (Lemon et al., 2015), "qgraph" (Epskamp et al., 2017), and "car" (Fox et al., 2007) were used for divergence studies based on the 20 agro-morphological traits.
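The heritability and variability parameters named at the start of this section can be sketched in a few lines. The text cites its sources without printing the expressions, so the formulas below are the commonly used entry-mean forms and should be read as assumptions: phenotypic variance of a genotype mean over y years and r replications, broad-sense heritability as the genotypic share of it, genetic advance with a selection differential k = 2.06 (5% selection intensity), and GCV and PCV as standard deviations relative to the trait mean. The function name and input values are hypothetical.

```python
import math

def genetic_parameters(var_g, var_gy, var_e, trait_mean, n_years=2, n_reps=2, k=2.06):
    """Broad-sense heritability, GCV, PCV, GA, and GA as % of mean (assumed forms)."""
    # Phenotypic variance on an entry-mean basis
    var_p = var_g + var_gy / n_years + var_e / (n_years * n_reps)
    h2 = var_g / var_p
    ga = k * math.sqrt(var_p) * h2  # expected genetic advance in trait units
    return {
        "H2": h2,
        "GCV": 100 * math.sqrt(var_g) / trait_mean,
        "PCV": 100 * math.sqrt(var_p) / trait_mean,
        "GA": ga,
        "GAM": 100 * ga / trait_mean,  # genetic advance as percentage of mean
    }

# Hypothetical variance components and mean for one trait
params = genetic_parameters(var_g=16.0, var_gy=8.0, var_e=20.0, trait_mean=20.0)
```

For these inputs the phenotypic variance is 25, so the sketch gives a heritability of 0.64, a GCV of 20%, and a PCV of 25%; the gap between PCV and GCV widens as the interaction and residual variances grow.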

RESULTS
Meteorological observations recorded during the season (October-March) are represented in Figures 1A,B. All weather parameters except rainfall were recorded as means over the crop growing period, October-March. Rainfall was recorded as cumulative rainfall received during the period. The average maximum temperatures were 25.6 and 23.8 °C, and the average minimum temperatures were 8.9 and 9.2 °C, during the 2017-2018 and 2018-2019 crop seasons, respectively. During 2017-2018, rainfall and sunshine hours were 6.0 mm and 5.5 h, respectively, whereas during the 2018-2019 crop season, rainfall and sunshine hours were 138.4 mm and 4.8 h, respectively.

Mean Performance and Variation for Phenotypic Traits
Extensive phenotypic variation was observed for seed yield and related components under normal sown conditions during both seasons. A large range of variation was observed for most of the traits under study, with the coefficients of variation (CV) ranging from 1.80% for days to maturity to 30.0% for harvest index. The high CV for some of the yield traits, such as TSC, SY/P, TSY/Plot, BY/Plot, and HI, was due to the longer duration and photoperiod sensitivity of some Canadian accessions (CN-105305, CN-34005, and CN-34008). Most of the traits were approximately normally distributed, apart from the flowering traits and PH_FPB (Figure 2A). Also, the acceptable level for each trait was indicated in green on the histograms based on the ideotype concept in mustard given by Bhargava et al. (1984), Thurling (1991), VijayaKumar et al. (1996), and Yadav et al. (2017), and on the DUS guidelines given by the Protection of Plant Variety and Farmers Right Authority of India. The boxplots obtained between seasons for each trait were compared using the Wilcoxon statistic, and the corresponding level of significance is shown by p-values in the figures. The analysis indicated a significant mean difference between seasons for each trait. The mean values of all traits except siliquae on the main shoot and harvest index were slightly higher in the 2017-2018 season than in 2018-2019 (Figure 2B). The mean performance of SY/P during 2017-2018 was 20.38 ± 0.4 g, whereas during 2018-2019 it was 15.38 ± 0.3 g. Based on the mean performance during 2017-2018, IC-597867 had the highest seed yield/plant (78.5 g), and CN-34005 had no yield. Similarly, during 2018-2019, IC-597867 remained the highest yielder (59.5 g per plant), whereas CN-105364 yielded the lowest (1 g per plant). The best-performing genotype across years was IC-597867, whereas CN-105364 was the poorest performer. An overview of the agro-morphological traits pooled over the 2 years is shown in Table 1.
Significant effects of the sources of variation are presented in Table 2. The effect of genotypes was significant (P ≤ 0.01) for all the studied traits in both years, representing the presence of considerable genetic variability in the germplasm for all the traits. Significant G × E interactions were observed for all the traits in the pooled analysis, suggesting that the genotypes respond differentially to environmental conditions.

Estimation of Heritability and Genetic Parameters
The phenotypic variance was partitioned into genotypic variance ($\sigma^2_g$), G × E variance ($\sigma^2_{gy}$), and error variance ($\sigma^2_e$). Furthermore, the genotypic and G × E variances were compared with the total phenotypic variance to identify the magnitude of the genotypic contribution for Brassica improvement. In the present study, most of the traits were highly heritable (>0.60) as per the scale of Robinson (1966) in combined environments, except SB and HI, which showed low heritability (<0.30). Similarly, PB, TSC, SPS, TSY/Plot, and BY/Plot showed moderate heritability (Table 2). The estimates of broad-sense heritability in pooled data ranged from 0.22 (SB) to 0.86 (DCF). Flowering traits, plant height-related traits, siliquae on the main shoot, siliqua length, seed size, and seed yield/plant were found to be more heritable. For seed yield/plant, the heritability was higher in the first year (2017-2018) than in the second year (2018-2019), whereas the combined analysis resulted in the lowest heritability, indicating that a significant G × E variance was partitioned out of the genetic variance obtained for this trait in the individual environments.
The PCV and GCV, genetic advance, and GA as % of mean were calculated along with heritability for all the traits (Table 2). The highest GCV and PCV were observed for plant height up to the first primary branch (46.43 and 64.38%, respectively), and the lowest GCV and PCV were recorded for days to maturity (4.09 and 5.01%, respectively). The results suggested a narrow difference between GCV and PCV for highly heritable traits such as DFI, DFF, DHF, DCF, DMT, PH_Fl, PH_M, MSL, etc., which can be improved directly by selecting genotypes having a higher trait value. However, the difference was more prominent for traits with lower heritability, such as PB, SB, TSC, BY/Plot, SY/P, and TSY/Plot, indicating a significant influence of the environment and G × E interactions. In all these traits, PCV was considerably higher than GCV. The genetic advance as a percentage of the mean ranged from 8.6% in DMT to 94.0% in PH_FPB. The results showed that selecting the top 5% of the genotypes could result in a genetic improvement of 51.6% for DFI, 50.4% for DFF, and so on (Table 2). Among the parameters under study, high heritability (≥60%), genetic advance as percentage of mean (>20%), PCV (>20%), and GCV (>20%) were observed together in characters such as days to flowering initiation, days to 50% flowering, days to 100% flowering, plant height at flowering, plant height up to the first primary branch, seed size, and seed yield/plant. High PCV, GCV, and genetic advance values with low to moderate heritability were observed for total seed yield/plot, biological yield/plot, and harvest index.

Association Among Traits and Their Contribution Toward Seed Yield per Plant
TABLE 2 | Genotypic variance ($\sigma^2_g$), genotype-year interaction variance ($\sigma^2_{gy}$), standard error (SE), and broad-sense heritability ($H^2$) estimated for each season and for pooled data, along with pooled mean values, genetic and phenotypic coefficients of variation (GCV and PCV), genetic advance (GA), and genetic advance as percentage of mean for the 20 traits studied.
The utility of independent traits in selection can be judged by their significant association with seed yield (the dependent trait). In the present study, genotypic correlations between the 20 characters were studied in all possible pairwise combinations (Figure 3). The prime economic trait, seed yield per plant, was significantly correlated with several of the studied traits. Stepwise regression identified BY/Plot, TSY/Plot, PH_FPB, SS, TSC, DFI, PH_M, SMS, and SL as highly significant traits. With seed yield per plant as the dependent variable, these traits together explained approximately 46% of the total variance (R² = 0.46; Table 3).
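The pairwise association between two traits, computed on their BLUP values, can be sketched as below. This is only an illustration of Pearson's correlation coefficient in plain Python, not the "corrr" workflow used in the study; the function name and the BLUP values are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical BLUPs of two traits for four genotypes
seed_size_blup = [-1.0, 0.0, 1.0, 2.0]
seed_yield_blup = [-2.0, 1.0, 0.0, 3.0]
r = pearson_r(seed_size_blup, seed_yield_blup)
```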
The path analysis (Table 4) showed that total seed yield/plot had the highest positive direct effect ($\rho_{X1}$ = 0.242) on seed yield per plant, followed by siliquae on the main shoot ($\rho_{X4}$ = 0.225) and seed size ($\rho_{X5}$ = 0.215). However, seed yield per plant was negatively and directly affected by plant height up to the first primary branch ($\rho_{X2}$ = −0.185) and plant height at maturity ($\rho_{X9}$ = −0.150). Siliqua length and biological yield/plot showed a positive indirect effect on yield (ρ = 0.216 and ρ = 0.203, respectively). However, days to flowering initiation (ρ = −0.350) and plant height up to the first primary branch (ρ = −0.235) contributed negative indirect effects on seed yield. The present study also sought to identify promising genotypes for the significantly associated traits identified by regression and path analysis (Table 5). Some of the genotypes showed superior performance for more than one trait; for example, IC-597867 was found to be a potential donor for traits such as basal branching, total siliquae count, and seed yield per plant. Similarly, PBR-210 and RE-7-1 were found to be superior for total seed yield per plot and seed yield per plant. These trait donors can be further utilized for developing better genotypes through systematic hybridization.
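The way path analysis splits a correlation into direct and indirect effects can be sketched as follows. This is a minimal illustration of the standard single-equation path model, in which the direct effects p solve R_xx p = r_xy and the indirect effect of trait i via trait j is r_ij p_j; it is not the sequential analysis or the R packages used in the study. The helper names and the two-trait correlation values are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def path_coefficients(r_xx, r_xy):
    """Direct effects, indirect effects, R^2, and residual effect."""
    direct = solve_linear(r_xx, r_xy)
    n = len(direct)
    # indirect[i][j]: effect of trait i on yield routed through trait j (i != j)
    indirect = [[r_xx[i][j] * direct[j] if i != j else 0.0 for j in range(n)]
                for i in range(n)]
    r_squared = sum(p * r for p, r in zip(direct, r_xy))
    residual = (1.0 - r_squared) ** 0.5
    return direct, indirect, r_squared, residual

# Hypothetical correlations among two predictor traits and with seed yield
r_xx = [[1.0, 0.5], [0.5, 1.0]]
r_xy = [0.6, 0.5]
direct, indirect, r2, resid = path_coefficients(r_xx, r_xy)
```

By construction, each trait's direct effect plus the sum of its indirect effects reproduces its correlation with seed yield, which is the decomposition reported in a path table.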

Diversity Analysis
Cluster analysis based on phenotypic data revealed that the genotypes were distributed into five clusters (I to V) at a Mahalanobis D² value of 60.0 (Figure 4A). The D² distances in the present study ranged from 0 to 351.38, which also indicated huge genetic diversity in the population. Cluster I had 281 genotypes, followed by clusters II and III (three genotypes each) and clusters IV and V (one genotype each). Cluster I was further divided into 26 subclusters (a to z) at a Mahalanobis D² value of 20.0 (Figure 4 and Supplementary  (2), and subcluster I (d-i) is dominated by high-yielding cultivars and advanced breeding lines of Indian origin. Most of the indigenous collections (IC) based on B. juncea var. rugosa and the Canadian gene pool grouped together in subcluster I (j-z) (Figure 4B). The relative contribution of each character indicated that seed size (13.14%), days to 50% flowering (10.85%), and siliqua length (7.88%) contributed the most toward total divergence (Table 6). All the popular Indian cultivars were included in cluster I. Furthermore, cluster I contained accessions with the highest total seed yield per plot and seeds per siliqua, which are early and require the fewest days for the flowering traits. Genotypes belonging to cluster II had a longer duration for the flowering traits, a higher harvest index, and the lowest values for the number of secondary branches, main shoot length, and total siliquae count. Cluster III had genotypes with higher plant height and biological yield per plot but lower values for SL, SPS, and HI. Cluster IV contained a single genotype (IC-597867) having the highest seed yield per plant, with higher MSL, SPS, and TSC, smaller seed size, and shorter plant height up to the first primary branch. Similarly, cluster V contained a single genotype (CN-34005) having maximum values for SS and SL and minimum values for PH_Fl, PH_M, PB, SY/P, TSY/Plot, and BY/Plot.
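The D² values that drive this clustering can be sketched for a single pair of genotypes. The example computes the squared Mahalanobis distance between two vectors of trait BLUPs given a residual variance-covariance matrix, solving a linear system rather than inverting the matrix; it is a generic illustration, not the "biotools" routine used in the study. The helper names, the two traits, and all numbers are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def mahalanobis_d2(x_i, x_j, cov):
    """Squared Mahalanobis distance (x_i - x_j)' cov^{-1} (x_i - x_j)."""
    d = [p - q for p, q in zip(x_i, x_j)]
    z = solve_linear(cov, d)  # z = cov^{-1} d without forming the inverse
    return sum(p * q for p, q in zip(d, z))

# Hypothetical BLUPs of two traits for two genotypes, and a residual covariance
cov = [[2.0, 1.0], [1.0, 2.0]]
d2 = mahalanobis_d2([11.0, 6.0], [10.0, 5.0], cov)
```

Because the covariance matrix enters the distance, correlated traits are not double counted; a full pairwise D² matrix built this way would then be passed to a hierarchical clustering routine.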
Although the genotypes in cluster I were from different geographical locations, genotypes belonging to the same location or the same breeding program tended to fall together in the same subclusters. For example, the DRMRIJ-17 series (17-41, 17-42, 17-43, and 17-52), derived from ICAR-Directorate of Rapeseed and Mustard Research, Bharatpur, fell in subcluster Id of cluster I. Similarly, the indigenous collections from Arunachal Pradesh, such as IC-597870, IC-597871, and IC-597881 (subcluster Io) and IC-597904 and IC-597949 (subcluster In), and most of the Canadian gene pool from subclusters I (t-y) followed the same pattern as the DRMRIJ series. In contrast to this observation, a few accessions collected from the same region, such as the Canadian germplasm, did not fall in a single cluster or subcluster, indicating that geographical proximity does not always result in genetic similarity. Based on the similarity matrix, the inter-cluster D-values ranged from 8.30 to 15.68 (Figure 5). The inter-cluster distances were higher than the intra-cluster distances, indicating wide genetic diversity among the groups of genotypes. Clusters II and IV were strikingly diverse from the rest of the clusters (inter-cluster D-value = 15.68); therefore, intercrossing the genotypes from these two clusters may create wider variability and is expected to produce high-yielding transgressive segregants in the mustard breeding program. The minimum inter-cluster D-value (8.30), detected between clusters I and IV, showed the higher genetic similarity between these clusters. Cluster II showed the maximum intra-cluster distance (6.29), followed by cluster III (4.68). Because they each contain a solitary genotype, clusters IV and V have no intra-cluster distance. The magnitudes of the intra-cluster distances were not always proportional to the number of genotypes in the cluster, as the intra-cluster distance in cluster I was moderate (4.12) despite it containing the most genotypes (281).
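Intra- and inter-cluster distances of the kind reported above can be sketched from a pairwise distance matrix and a cluster assignment. The text does not state how the distances were summarized, so the example assumes one common convention, averaging the pairwise D values within and between clusters; the function name, distance matrix, and labels are hypothetical.

```python
from itertools import combinations

def cluster_distances(dist, labels):
    """Average intra- and inter-cluster distances from a symmetric distance matrix."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    intra = {}
    for lab, members in clusters.items():
        pairs = list(combinations(members, 2))
        # A single-member cluster has no pairs, hence no intra-cluster distance
        intra[lab] = sum(dist[i][j] for i, j in pairs) / len(pairs) if pairs else None
    inter = {}
    for a, b in combinations(sorted(clusters), 2):
        vals = [dist[i][j] for i in clusters[a] for j in clusters[b]]
        inter[(a, b)] = sum(vals) / len(vals)
    return intra, inter

# Hypothetical pairwise D values for four genotypes in three clusters
dist = [
    [0.0, 2.0, 9.0, 7.0],
    [2.0, 0.0, 8.0, 6.0],
    [9.0, 8.0, 0.0, 5.0],
    [7.0, 6.0, 5.0, 0.0],
]
intra, inter = cluster_distances(dist, labels=["I", "I", "II", "III"])
```

The sketch also shows why a solitary-genotype cluster has no intra-cluster distance: there is no pair of members to average over.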

DISCUSSION
The exploitation of the genetic diversity present in a species, through the intervention of plant breeders, can lead to the improvement of economically important traits of mandated crops to benefit farmers and consumers (Salgotra et al., 2015). Yield is one of the most important economic traits and is the product of multiplicative interactions of contributing characters (Kant and Gulati, 2001). To deal with these complex interactions, a multipronged strategy combining agronomical and breeding approaches is needed. Hence, the major objective of the mustard improvement program is to develop varieties with high yield potential through the introgression of various yield component traits from lines with high trait values. Screening germplasm to identify lines with higher trait values is therefore of prime importance.
Brassica juncea is a major oilseed crop of the Rabi (winter) season and is highly sensitive to weather parameters such as temperature and solar radiation, which affect growth, phenological events, and crop yield (Kumar, 2005). Intermittent rains during the flowering time of 2018-2019 caused substantial yield losses through physiological disorders and biotic stresses. Therefore, a large variation in yields from year to year can be attributed to the weather conditions. Mustard prefers moderate temperatures between 18 and 25 °C, with an optimum around 20 °C, and moderate rain of approximately 25-40 cm during the growth period (Bhatt et al., 2015). The sensitive periods for mustard crop growth comprise emergence, flowering, siliqua formation, siliqua filling, and physiological maturity. The analysis revealed that maximum and minimum temperatures had a positive effect on the yield during the sensitive period in both seasons, whereas total rainfall had a negative effect on the mustard yield during 2018-2019. The intermittent rainfall resulted in high relative humidity (RH > 92%), with T max ranging from 18.7 to 24.4 °C, recorded from standard meteorological weeks 4 to 10 during the reproductive period of the plant (88-128 DOS), and led to subsequent yield loss.

Mean Performance and Variation for Phenotypic Traits
Seed yield and related traits showed wide phenotypic variation during both seasons. The mean performance indicated the existence of enormous variability for seed yield and related components, which offers greater opportunities for utilizing these traits in future breeding programs (Kumar et al., 2020). The genetic variation available for traits such as total seed yield per plot, total siliquae count, plant height at first branching, etc., can meet the breeding objective of evolving a high-yielding B. juncea cultivar. The greater variability observed in the present study could be due to the use of genotypes from diverse geographical origins. The lowest coefficient of variation, for the number of days to maturity (1.80%), indicated a strong genetic influence on this trait, whereas the highest coefficient of variation, for harvest index (30.0%), indicated a greater influence of environmental fluctuations (Khan et al., 2008). Some of the yield traits, such as total siliqua count, seed yield per plant, total seed yield per plot, biological yield, and harvest index, showed a CV above 20%, which may be due to the longer duration and photoperiod sensitivity of some exotic lines, especially the Canadian lines. Characters with extensive genetic variability provide a better opportunity for selection than those with a narrow range of variability. Ali et al. (2003), Yadava et al. (2011), and Roy et al. (2018) had also found significant genetic variation, as indicated by the range, for different seed yield-contributing characters in Indian mustard, but comparatively lower than in the present study.
The analysis of variance revealed highly significant differences for all the characters, representing the presence of variability that can be utilized for genetic improvement. Significant variance due to G × E interactions for all the 20 characters confirmed that the genotypes respond differently in diverse environments. Therefore, it is possible to exploit different environments by developing environment-specific varieties from the diversity panel. The importance of G × E interactions for seed yield had also been observed by Gunasekara et al. (2006), Kumar et al. (2012), Priyamedha and Haider (2017), and Kumari et al. (2019) in Indian mustard and canola (B. napus).

Significance of Heritability and Other Genetic Parameters for the Selection of Traits
The observed variation in a population may be due to genetic factors, environmental factors, or both. Only the variation due to genetic components is heritable. Heritability alone does not indicate the actual amount of genetic gain in a selection program, as it is also inclusive of nonadditive genetic factors (Shivanna, 2008). All the yield traits in the current study were highly heritable in individual seasons. Nevertheless, partitioning out the genotype by environment interaction decreased the heritability across environments (pooled analysis). This type of reduced heritability across environments was also reported by Phuke et al. (2017) in sorghum.
Table 5 (final row): Seed yield per plant (SY/P), acceptable limit >30 g: IC-597867 (IV), PBR-210 (Id), RE-7-1 (Ic). *The acceptable limit for each trait is fixed based on the various mustard ideotype concepts given by Bhargava et al. (1984), VijayaKumar et al. (1996), DUS guidelines, Thurling (1991), and Yadav et al. (2017). Parentheses indicate the subcluster details.
The study showed that high broad-sense heritability, genetic advance as percentage of mean, and GCV were observed in traits such as days to flowering initiation, days to 50% flowering, days to 100% flowering, plant height at flowering initiation, plant height up to the first primary branch, seed size, and seed yield/plant. The high heritability of seed yield per plant (63%) in the pooled analysis, with a high GA of 60.8%, indicated that the results would be repeatable and rewarding over generations of selection cycles, which is a boost for the breeding program. This concurred with previous studies (Kumar and Misra, 2007; Yadava et al., 2011; Tripathi et al., 2013; Meena et al., 2017; Kumar et al., 2019). These results indicate a greater scope for selection to improve seed yield per se in the Brassica breeding program (Tiwari et al., 2017). Estimates of heritability for yield component traits varied from low (22%) to high (85%). There is a need to identify the traits that should be targeted for improving the seed yield in mustard. Flowering traits, plant height traits, and seed size showed high heritability and high GA, as some of them are governed by a few major quantitative trait loci, as reported earlier by Kaur and Banga (2015) and Akhatar et al. (2021). The high heritability and moderate genetic advance for plant height at maturity indicated that improvement in this trait could be made through selection to some extent. High genetic advance and moderate heritability were shown by the number of primary branches, total siliquae count, total seed yield/plot, and biological yield/plot, in which both additive and nonadditive gene actions may be expressed. A parameter having low heritability coupled with high genetic advance revealed the additive gene effects of traits (Belete et al., 2011). The low heritability is due to high environmental effects, and selection per se may be ineffective for such traits, as for harvest index in the present study. None of the traits exhibited low heritability with low genetic advance.
Higher PCV values than their corresponding GCV for most of the traits showed the considerable role of environment in the expression of these traits; therefore, the variation in the genotypes is due to both genotype and the environment (Kumar et al., 2015). The wide difference between PCV and GCV was detected for plant height up to the first primary branch, the number of secondary branches, total siliquae count, seed yield/plant, total seed yield/plot, biological yield/plot, and harvest index, which indicated the high contribution of environmental variance to the phenotypic variance.

Association Among Traits
Correlation analysis indicated that seed yield per plant was significantly correlated with biological yield/plot, total seed yield/plot, seed size, plant height up to the first branch, total siliqua count, days to flowering initiation, plant height at maturity, siliqua length, and siliquae on the main shoot, which implies that selection to improve these traits would lead to improved seed yield (Rauf and Rahim, 2018). Genotypic correlations of seed yield per plant with flowering characters (days to flowering initiation, days to 50% flowering, days to 100% flowering, days to flowering completion, and days to maturity) and with plant height characters (plant height at flowering initiation, plant height up to the first primary branch, and plant height at maturity) were negative in the studied genotypes, indicating that selection for these traits would decrease seed yield (Joshi et al., 2009; Yadava et al., 2011). A reduction in days to flowering prevents exposure of the plant to the high temperature that builds up during the late growth period and the consequent reduction in yield due to sterility and shriveling of seeds (Azharudheen et al., 2013). Also, reduced plant height makes the plant architecture more compact, so that the source is utilized toward increasing yield. These attributes can serve as marker characters for seed yield improvement in mustard. According to Kardam and Singh (2005), characters such as height up to the first branch, seed yield/plant, number of primary branches, number of siliquae per plant, and seed size have been reported as the main yield-contributing traits.
For instance, the number of secondary branches, a trait with very low heritability, was significantly and positively correlated with high-heritability traits, viz., total siliquae count (r = 0.57) and biological yield/plot (r = 0.33). Therefore, the selection of genotypes with higher siliqua count and biological yield/plot would indirectly improve the number of secondary branches per plant in successive generations. This is in accordance with the findings of Prasad et al. (2001), Swarnkar et al. (2002), and Singh et al. (2002).
Stepwise regression is a method for automatic variable selection based on partial correlations of a dependent variable with the independent variables; it is near-optimal in the sense of maximizing the squared multiple correlation coefficient (R²) of the dependent variable (Card et al., 1988). Based on regression studies, biological yield/plot, total seed yield/plot, plant height up to the first primary branch, seed size, total siliquae count, number of days to flowering initiation, plant height at maturity, siliquae on the main shoot, and siliqua length were the most contributing traits for seed yield per plant. However, these independent traits, each with an individual R² of less than 20%, had only a negligible direct contribution to the seed yield per plant, although they had a significant association with the dependent variable. This indicates that traits with less direct influence cannot be ignored because their cumulative contribution to seed yield/plant could be highly influential (Maphumulo et al., 2015). In combination, these identified traits could be used as effective indicators of yield performance in Indian mustard; hence, a selection index based on the identified influential traits could realize higher genetic advances than selecting for seed yield alone (Hussain et al., 2004; Sandhu et al., 2019).
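The idea of adding traits one at a time so as to maximize R² can be illustrated with a small forward-selection sketch. It is a simplification and not the procedure used in the study: it works on standardized traits (so that R² follows directly from correlations), it only adds variables and never removes them, and it stops on a hypothetical minimum R² gain (`min_gain`) rather than on a significance test. The correlation values are invented for illustration.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def r_squared(corr, r_y, subset):
    """R² of the dependent variable regressed on the traits in `subset`."""
    if not subset:
        return 0.0
    sub = [[corr[i][j] for j in subset] for i in subset]
    ry = [r_y[i] for i in subset]
    beta = solve(sub, ry)                   # standardized regression coefficients
    return sum(b * r for b, r in zip(beta, ry))


def forward_select(corr, r_y, min_gain=0.01):
    """Greedily add the trait that raises R² most; stop when the gain is small."""
    chosen, current = [], 0.0
    remaining = list(range(len(r_y)))
    while remaining:
        best = max(remaining, key=lambda j: r_squared(corr, r_y, chosen + [j]))
        gain = r_squared(corr, r_y, chosen + [best]) - current
        if gain < min_gain:
            break
        chosen.append(best)
        remaining.remove(best)
        current += gain
    return chosen, current


# Invented correlations among three traits and with the dependent variable.
corr = [[1.0, 0.5, 0.0],
        [0.5, 1.0, 0.0],
        [0.0, 0.0, 1.0]]
r_y = [0.6, 0.5, 0.05]
chosen, r2 = forward_select(corr, r_y)
print(chosen, round(r2, 4))
```

In this toy case the second trait adds only about 5% to R² after the first has entered, even though its simple correlation with the dependent variable is 0.5. This shows how a trait can be significantly associated with yield while having a small individual R² contribution.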

Contribution Toward Seed Yield per Plant
The specific direct and indirect effects of traits and the relative importance of each trait in determining the key goal, i.e., seed yield, were assessed through path coefficient analysis (Albayrak and Tongel, 2006). Path analysis showed that total seed yield/plot, siliquae on the main shoot, and seed size had a highly positive correlation with and a moderate direct effect on seed yield per plant, which suggested that selection for these traits would be quite effective for improving seed yield in mustard. Traits such as plant height up to the first primary branch and plant height at maturity had a moderate negative direct effect. Similar conclusions were reported by Kardam and Singh (2005) and Sandhu et al. (2019). Indirect effects were ranked following Lenka and Mishra (1973), as follows: 0.00-0.09 = negligible, 0.10-0.19 = low, 0.20-0.29 = moderate, and >0.30 = high path coefficients. Plant height at maturity toward dwarf plant type exhibited negligible indirect effects on seed yield, indicating that it contributed little to seed yield. The indirect effects of the rest of the traits were low to moderate, showing that indirect selection for these traits would improve the yield of mustard. Hence, for improving the seed yield per plant in mustard, one should focus on "selecting for" traits such as total seed yield per plot (under crop community), more siliquae on the main shoot, bold seed size, and highest total siliquae count per plant and "selecting against" plant height up to the first primary branch (i.e., selection for basal branching genotypes). Traits such as days to flowering initiation, plant height up to the first primary branch, siliqua length, and biological yield have significant indirect effects via component traits toward seed yield per plant.
The aforementioned traits with high direct effects inferred from path analysis, such as total seed yield/plot, seed size, siliquae on the main shoot, plant height up to the first primary branch, and plant height at maturity, had moderate to high heritability (≥30%) coupled with high GA as percentage of the mean (>20%). These traits also showed a highly significant (p ≤ 0.001) correlation with seed yield/plant. Table 5 summarizes promising genotypes identified based on the cumulative performance for yield and associated traits, which can serve as a guide for developing high-yielding ideotypes for the entire mustard breeding community.
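The partitioning of a correlation into direct and indirect effects can be sketched as follows. The sketch assumes the standard formulation of path coefficient analysis: the direct effects P solve R P = r_y, where R holds the correlations among the component traits and r_y their correlations with seed yield, and the indirect effect of trait i via trait j is r_ij × P_j. The rating function uses the scale quoted above and treats exactly 0.30 as high, which is an assumption because the quoted scale leaves that boundary open. All names and numbers are hypothetical.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def path_analysis(corr, r_y):
    """Return direct effects, the indirect-effect matrix and the residual effect."""
    n = len(r_y)
    direct = solve(corr, r_y)
    # indirect[i][j]: effect of trait i on yield routed through trait j
    indirect = [[corr[i][j] * direct[j] if i != j else 0.0 for j in range(n)]
                for i in range(n)]
    explained = sum(p * r for p, r in zip(direct, r_y))
    residual = math.sqrt(max(0.0, 1.0 - explained))
    return direct, indirect, residual


def rank_effect(value):
    """Rate a path coefficient by absolute size on the scale quoted in the text."""
    size = abs(value)
    if size < 0.10:
        return "negligible"
    if size < 0.20:
        return "low"
    if size < 0.30:
        return "moderate"
    return "high"


# Invented correlations for two component traits and seed yield.
corr = [[1.0, 0.5],
        [0.5, 1.0]]
r_y = [0.6, 0.5]
direct, indirect, residual = path_analysis(corr, r_y)
for i, p in enumerate(direct):
    print(f"trait {i}: direct = {p:.3f} ({rank_effect(p)})")
print(f"residual effect = {residual:.3f}")
```

For each trait, the direct effect plus the sum of its indirect effects reproduces its correlation with seed yield, which is a convenient check on any path table.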

Diversity Analysis
The advantage of genetic diversity analysis based on the Mahalanobis D² distance over the Euclidean distance is that it accounts for the correlations between highly correlated variables and scales the contribution of each variable to the distance value according to its variability (Ghorbani, 2019). The Mahalanobis D² distance among genotypes in the diversity panel ranged from 0.0 to 351.38, a range wider than those in previous reports by Bind et al. (2015), Gupta et al. (2015), and Chandra et al. (2018). The large extent of genetic diversity in the present study was due to the involvement of genetic material from four continents across the globe. Furthermore, the number of genotypes studied was higher compared with the previous reports. Most of the genotypes (281/289) were included in a single cluster and the rest in four different clusters. The cluster-forming point had a very high D² value of 232. This suggests that the eight genotypes included in clusters II to V were more diverse than those in cluster I. A detailed analysis of cluster I also suggested that there were 26 subclusters with a cluster-forming point at a D² value of 20. The intra-cluster distance of cluster I (D² = 17) was much higher than in earlier reports related to D² clustering studies by Yadava et al. (2011) and Kumari and Kumari (2018), indicating a wider genetic base of materials within cluster I in the present study. Subcluster I (d-i) included most of the high-yielding cultivars and breeding lines of Indian origin. The rest of the subclusters consisted of exotic and indigenous gene pools received in various bilateral collaborative projects and utilized in mustard improvement by various researchers (Chauhan et al., 2011). Subcluster I (j-z) consisted of lines having more height and longer duration with various oil quality traits, which remain unexploited due to lack of synchrony in flowering time.
Still, there was no obvious clustering pattern related to geographic proximity and use types among the mustard genotypes. Grouping of certain improved varieties and cultivars along with Canadian and Australian genetic stocks and indigenous collections from Arunachal Pradesh (India) indicated that geographical distribution need not necessarily be an indicator of genetic divergence, as reported by Verma and Sachan (2000) and Jeena and Sheikh (2003). The possible reason could be the common ancestry of these genotypes, which permitted the free exchange of germplasm among the breeders of different locations and/or the unidirectional selection experienced by breeders in tailoring the promising cultivars for different locations (Yadava et al., 2011; Mukesh Sankar et al., 2014).
Broad variability in the current material holds great promise to use these genotypes from different clusters (such as II and IV) for future breeding programs. The highest-yielding genotype (IC-597867) identified in the present study was present in cluster IV, whereas genotypes with a high harvest index were contained in cluster II. So, the improved cultivars from cluster I can be utilized to exploit the genotypes present in clusters II and IV for further yield increments and genetic diversification through hybridization.
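The difference between the Mahalanobis D² and the squared Euclidean distance mentioned at the start of this section can be made concrete with a short sketch. It assumes a known covariance matrix for the traits; which matrix to use (for example a pooled error covariance matrix) is an analysis decision that is not specified here. The function names, trait vectors, and covariance values are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def mahalanobis_d2(x, y, cov):
    """D² = (x - y)' S^-1 (x - y), computed without forming the inverse of S."""
    d = [a - b for a, b in zip(x, y)]
    z = solve(cov, d)                       # z = S^-1 d
    return sum(di * zi for di, zi in zip(d, z))


def euclidean_d2(x, y):
    """Squared Euclidean distance, ignoring variances and correlations."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


# Two hypothetical genotypes measured for two traits.
g1, g2 = [2.0, 1.0], [0.0, 0.0]

# Unequal variances: the more variable first trait is down-weighted.
cov_scaled = [[4.0, 0.0],
              [0.0, 1.0]]
print(euclidean_d2(g1, g2), mahalanobis_d2(g1, g2, cov_scaled))

# Strongly correlated traits: a joint shift counts less than two separate ones.
cov_corr = [[1.0, 0.8],
            [0.8, 1.0]]
print(euclidean_d2([1.0, 1.0], g2), mahalanobis_d2([1.0, 1.0], g2, cov_corr))
```

In both toy cases D² is smaller than the squared Euclidean distance, first because the more variable trait is scaled down and then because a difference shared by two correlated traits is not counted twice.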

CONCLUSION
The study assessed a diversity panel representing 289 genotypes across four continents for the existence of genetic variability for seed yield and yield-related traits over two seasons. The results revealed an enormous genetic variability for the traits under study, which can be exploited to acquire further breeding gains. The use of BLUP values for genotypes provided higher selection accuracy by reducing residual error, which permitted the identification of potential genotypes for the Brassica improvement program. Flowering and plant height-related traits were found to be more heritable, although these were negatively correlated with yield. High PCV, GCV, and genetic advance values with low to moderate heritability were observed for total seed yield, which indicated that the yield performance still needs to be improved to produce superior varieties. Moreover, the current study could aid breeders in enhancing the seed yield by considering the traits that have a good correlation with seed yield. Cluster analysis revealed that the genotypes under study were highly diverse; they could be utilized in future hybridization programs, which may release transgressive segregants for economic trait improvement. The greater variability among the studied genotypes and the association between seed yield and secondary traits in the current study suggested the exploitation of superior genotypes in the near future.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author.

AUTHOR CONTRIBUTIONS
DY conceptualized and supervised the research and contributed to reviewing the manuscript. RS and SLS performed the field trials and prepared the original draft manuscript. SS performed the first-year field trial. SM conducted the data analysis, curated the data, and contributed to reviewing the manuscript. RC helped in field trials and data entry. YP and NS coordinated the study and revised the manuscript. SV administered the project and revised the manuscript. All authors read the manuscript and agreed with its content.

ACKNOWLEDGMENTS
The authors express their sincere gratitude to the Director, ICAR-IARI, New Delhi, for his consistent support and for providing all the experimental field facilities for the research work. RS gratefully acknowledges ICAR-IARI, New Delhi, for providing the IARI Merit Fellowship. Thanks are also due to the Department of Agricultural Physics, ICAR-IARI, New Delhi, for providing the meteorological data of the experimental sites.