Genetic characterization of root architectural traits in barley (Hordeum vulgare L.) using SNP markers

Increasing attention is paid to providing new tools to breeders for targeted breeding for specific root traits that are beneficial in low-fertility, drying soils; however, such information is not available for barley (Hordeum vulgare L.). A panel of 191 barley accessions (originating from Australia, Europe, and Africa) was phenotyped for 26 root and shoot traits using the semi-hydroponic system and genotyped using 21 062 high-quality single nucleotide polymorphism (SNP) markers generated by genotyping-by-sequencing (GBS). The population structure analysis of the barley panel identified six distinct groups. We detected 1199 significant (P<0.001) marker-trait associations (MTAs) with r2 values up to 0.41. The strongest MTAs were found for root diameter in the top 20 cm and the longest root length. Based on the physical locations of these MTAs in the barley reference genome, we identified 37 putative QTLs for the root traits, and three QTLs for shoot traits, with nine QTLs located in the same physical regions. The genomic region 640-653 Mb on chromosome 7H was significant for five root length-related traits, where 440 annotated genes were located. The putative QTLs for various root traits identified in this study may be useful for genetic improvement regarding the adaptation of new barley cultivars to suboptimal environments and abiotic stresses.


Introduction
Barley (Hordeum vulgare L.) is the fourth most widely grown cereal crop worldwide after maize, rice and wheat (Schulte et al., 2009).Barley is increasingly used as an important source of nutrition for humans, although it has been used traditionally as animal feed and in the brewing industry.Intensive breeding for more than a century has resulted in a wide range of barley genotypes that are well adapted to various climatic conditions, with barley being considered a model species for temperate cereals (Badr et al., 2000;Schulte et al., 2009).However, genetic diversity is presumed to have declined in modern barley cultivars, forcing the introduction of exotic germplasm as one of the major genetic resources in innovative modern breeding programs (Ellis et al., 2000).
Roots are essential to plants for a variety of processes, including uptake and storage of water and nutrients, and support and anchoring of above-ground plant parts (Aroca et al., 2005).Root system architecture (RSA) plays an important role in soil exploration and nutrient accumulation (Rose et al., 2009;Chen et al., 2016).The adaptive characteristics of roots (especially in cereals) to withstand harsh environments have been investigated only to a limited extent due to (i) lack of knowledge on root system growth and functioning, (ii) difficulties in measuring relevant traits, and (iii) non-availability of an efficient root screening method (Richards et al., 2002).Compared to other cereals, barley has a higher degree of adaptability under different types of stresses (e.g.heat and drought), likely due to early root development and an extensive root system (Maurer et al., 2018).In addition, barley roots can grow to 2-meter depths in soils such as loams or deep sands (Gahoonia et al., 2001).
Molecular markers have been used widely to dissect complex traits in different species (Mishra et al., 2014;Xue et al., 2019;Mehrabi et al., 2020;Oyiga et al., 2020;Salarpour et al., 2020).Advanced genotyping technologies such as genotyping-bysequencing (GBS) are powerful methods to discover molecular markers using next-generation-sequencing (NGS) and allow rapid identification of genomic regions associated with the traits of interest (e.g.Elshire et al., 2011;Mehrabi et al., 2020;Oyiga et al., 2020).Recently, high-throughput NGS-based single nucleotide polymorphisms (SNPs) have been found to be efficient markers for genetic studies and genomic selection in breeding programs because of the low cost of acquiring a large number of markers and their co-dominant nature (Close et al., 2009).The first set of barley SNP markers was developed in elite European varieties (Kota et al., 2001).Chloupek et al. (2006) used SNP markers to study root system size in a panel of diverse genotypes and a double haploid (DH) population of barley.Naz et al. (2014) and Reinert et al. (2016) found important quantitative trait loci (QTLs) related to root and shoot traits in the different sets of barley germplasm.
Association genetics offers a powerful approach to identifying genetic variants that control complex traits and can be used to explore genetic variations associated with various traits across genotypes (Zhou et al., 2016;Mehrabi et al., 2020;Salarpour et al., 2020).Recently, Reinert et al. (2016) reported a GWAS using 5892 SNP markers and identifying QTLs that control barley root traits.Using the same strategy, Oyiga et al. (2020) found 11 QTLs associated with barley root architectural and anatomical responses to drought stress.However, the power and resolution of GWAS depends strongly on the extent of linkage disequilibrium (LD) across the whole genome of tested cultivars (Gupta et al., 2005).Various factors can affect LD, including allele frequency, population structure, mating system and admixture (Flint-Garcia et al., 2003).Therefore, a thorough understanding of population structure and LD patterns across the genome is critical to assembling a diversity panel for association studies.
We used an association panel containing 191 barley genotypes, mainly advanced breeding lines and key parental germplasm lines developed in the InterGrain Pty Ltd (Bibra Lake, Western Australia) barley breeding program between 2011 and 2015.The panel was phenotyped for root traits and genotyped using GBS-SNP markers.The objectives of this study were to (i) evaluate genetic variation in root traits in the panel, (ii) estimate the LD decay, and (iii) identify the major genetic regions responsible for various barley root traits that could be used in breeding for improved barley root architecture.

Plant materials and phenotyping
The 191 genotypes with 101 different pedigrees and of different origins (Australia, Europe and Ethiopia), with most (126 genotypes) being InterGrain advanced breeding lines, were used in this study (Supplementary Table 1).The phenotyping experiment was performed in a glasshouse at The University of Western Australia using an established 1.0-m-tall semi-hydroponic system (Chen et al., 2011a;Chen et al., 2011b;Chen et al., 2012) and a randomized complete block design with three biological replicates and two plants per replicate.The phenotypic data are reported elsewhere (Wang et al., 2021).Shoot height and leaflet number in each plant were determined at harvest.After harvesting, shoots and roots were separated, and root tissues were subsampled further by cutting into 20-cm sections (starting from the crown) to determine root morphology as described previously (Chen et al., 2011a;Chen et al., 2011b;Chen et al., 2012).
In addition to the parameters measured, root length ratio (RLR) and specific root length (SRL) were calculated (Table 1) for markertrait association (MTA) analysis as follows: RLR = root length at 0 À 20 cm depth=root length below 20 cm depth SRL = root length=root drymass (m g À 1 )

DNA extraction and SNP assays
Genomic DNA of the barley samples was extracted from fresh leaf tissue of 2-week-old seedlings as described by Kotchoni and Gachomo (2009).A GBS library was constructed for 191 barley DNA samples at Kansas State University, Manhattan, KS, USA, following Poland et al. (2012).In brief, DNA samples were digested with the PstI-HF (high fidelity) and MspI restriction enzymes (New England BioLabs Inc., Ipswich, MA, USA), and ligated to barcoded adapters and a common 'Y' adapter using T4 DNA ligase (New England BioLabs Inc.).All ligation products in the two 96-well plates were pooled and cleaned up using a QIAquick PCR Purification Kit (Qiagen Inc., Valencia, CA, USA).Primers complementary to both adapters were used for PCR.The PCR products were then cleaned again using the QIAquick PCR Purification Kit and size-selected for a range of 250-300 basepairs (bp) in an E-gel system (Life Technologies Inc., Carlsbad, CA, USA).DNA concentration was estimated using a Qubit 2.0 fluorometer and a dsDNA HS Assay kit (Life Technologies Inc.).The size-selected library was sequenced on an Ion Proton semiconductor sequencer (Life Technologies Inc.).
SNPs were called using both the reference-based pipeline (Li et al., 2009) and the Universal Network-Enabled Analysis Kit (UNEAK) pipeline in Trait Analysis by aSSociation, Evolution and Linkage (TASSEL) v.5 for SNP/variant discovery (Bradbury et al., 2007).Raw sequence reads from the Ion Torrent system with variable lengths were supplemented with poly-A at their 3' ends to ensure all reads had lengths of at least 64 bp as required by the TASSEL software.TASSEL sorted all the reads by barcodes and auto-trimmed them to 64 bases.Bi-allelic SNPs were determined by querying the filtered tags for pairs of sequences (Poland et al., 2012) if they differed in only one or two SNPs.SNPs called by two pipelines were merged, and duplicated SNPs were removed.Only the SNPs that were present in at least 80% of the genotypes in the population were used for further analysis.SNPs with a minor allele frequency<0.05and heterozygous in >10% of the accessions were excluded from the analysis (Supplementary Figure 1).

Population structure and linkage disequilibrium
Population structure of the bi-parental multi-locus genotypes was analyzed using the Bayesian clustering procedure implemented in STRUCTURE 2. 3.4 (Falush et al., 2003).Initial STRUCTURE runs were carried out with a length of 100 000 burn-in-periods and 10 000 MCMC (Markov Chain Monte Carlo) iterations by increasing K-values from 1 to 10.The most probable number of groups was determined by plotting the estimated likelihood values LnP(D) obtained from STRUCTURE runs against five repetitions of K values.LnP(D) is the log likelihood of observed genotype distribution in K clusters as calculated by STRUCTURE.The K value best describing the population structure was based on the criteria of maximizing DK (Evanno et al., 2005) using Structure Harvester v.0.6.94 (Earl and VonHoldt, 2012).
A proportion of the phenotypic variation explained by the model was assessed using correlation coefficients (r).Principal component analysis (PCA) was done by using "FactoMineR" in R v.4.1.2to evaluate the population structure, whereas cluster analysis was conducted using the kinship matrix elements as similarities by utilizing all given root and shoot traits data.The resulting output was visualized through the use of kinship matrix, with the aim of uncovering population information using Genome Association and Prediction Integrated Tool (GAPIT) (Lipka et al., 2012).To assess the accuracy in genotypic selection, an imputation was done using LD-kNNi by introducing k-nearest neighbor to evaluate the SNPs having strongest LD in genomic data (Money et al., 2015).In LD-kNNi, it is not necessary to refer to physical linkage, but rather to the correlation between any two SNPs in the dataset.The LD pattern across seven chromosomes of barley was investigated using TASSEL 5.2.44.The outputs from TASSEL (using sliding window of 50 SNPs and maf = 0.05) were used to generate LD decay plots using R.The LD decay explained by r 2 was determined using 21 062 highquality GBS-SNPs.The extent and distribution of LD were visualized by plotting intra-chromosomal r 2 values (at p<0.001) against the physical distance in base pairs using 1000 permutations.The estimation of LD decay using LOESS (locally estimated scatterplot smoothing) involved the determination of genetic distance at which the LOESS curve initially intersects the baseline r 2 value.The average LD decay of the panel was utilized to determine the point of intersection between the LOESS curve and the critical r 2 .Unlinked r 2 estimates were square-root transformed to approximate a normally distributed random variable, and the parametric 95 th percentile of that distribution was taken as a critical value of r 2 indicated in blue lines (Supplementary Figure 4), beyond which LD was likely caused by genetic linkage (Breseghello and Sorrells, 2006).

Marker-trait association
Marker-trait association analysis was done by utilizing various GWAS (GLM, MLM, MLMM, SUPER, and BLINK) with 26 root and shoot traits (Table 1) using R package GAPIT v.3.0 (Lipka et al., 2012).In comparison, Bayesian information criterion (BIC) and Linkage-disequilibrium Iteratively Nested Keyway (BLINK) model was better fit as it has demonstrated a reduced incidence of false positives, improved detection of true positives, and the ability to handle extensive datasets.BLINK is an improved model version of Fixed and Random Model Circulating Probability Unification (FarmCPU) and is statistically powerful and efficient in identifying significant SNPs associated with a trait of importance; hence, BLINK has been used in the present study.Using the panel of 21 062 SNP markers, we estimated random and fixed effects in order to reduce the rate of false positives in the BLINK model (Huang et al., 2019).The optimization was performed using BIC, which is twice the negative log likelihood plus the three penalties on number of parameters, using the equation BIC = -2 ln(L) + 2 k ln(n), where lnL is log likelihood, k is the number of pseudo QTNs (quantitative trait nucleotides), and n is the number of individuals.To reduce false MTAs due to population structure and relatedness, the mixed model incorporating principal components and a kinship matrix was used in the R package GAPIT v.3.0 (Lipka et al., 2012).Model correction for population structure and cryptic relatedness between lines was based on a compressed Efficient Mixed-Model Association (EMMA) kinship matrix with Q-values from STRUCTURE K=6 included as an efficient fixed effect.Manhattan plots showing positions of associated markers across the genome were constructed for each trait using the R package CMplot (Yin, 2018).The false discovery rate (FDR)-adjusted p-values used in GAPIT were found to be highly stringent.The FDR correction was performed using the Benjamini-Hochberg method to obtain a qvalue (FDR-adjusted p-value) (Benjamini and Hochberg, 1995).A threshold of q ≤0.05 was used to claim significant MTAs, after Bonferroni multiple test correction which corresponds to p ≤0.001 significant level used in BLINK model.Markers were clustered into one locus based on LD decay (2.97 Mb), and the locus was categorized as a QTL.
GBS-SNPs closely associated with QTLs were mapped to physical positions (Mb) by blasting the marker sequences through the GrainGenes database against the barley reference genome assembly Hordeum vulgare (Barley Morex V3) developed by the International Barley Genome Sequencing Consortium (Mascher et al., 2021).Furthermore, MapChart v.2.0 (Voorrips, 2002) was used to depict the linkage map and detected QTLs based on their physical positions (Mb).The functional annotation of potential candidate genes given in Table 2 was retrieved from the online tool BARLEYMAP (http://floresta.eead.csic.es/barleymap/)(Cantalapiedra et al., 2015).

Evaluation of root traits
The statistical evaluation of the phenotypic data from 189 accessions (excluding the lines IG15RT_0166 and IG15RT_0171 that failed to germinate during phenotyping) was presented elsewhere (Wang et al., 2021).Out of 26 root and shoot traits evaluated, 16 traits (13 root and three shoot traits) showed highly significant differences among genotypes, with coefficient of variation (CV) values ≥0.25 (Wang et al., 2021).For example, the average length of longest root was 113 ± 15 cm (mean ± SD), ranging from 60 to 158 cm across genotypes.

Population structure, principal components, and genetic relatedness
The panel of 191 barley germplasm lines generated more than 169 million sequence reads and 3 407 301 total SNP datapoints.Apart from 12 samples that generated high missing data rates due to poor DNA quality, all other samples produced high quality GBS-SNPs (5951 from UNEAK and 15 111 from the reference-based pipeline) with ≤20% missing data and were used for further analysis.
To analyse population structure, the clustering parameter K (1 to 10) was used to group genotypes based on genetic relationships from GBS-SNPs.The most probable number of structured subpopulations (K value) and the optimal K value were found by graphing DK against K, which showed a sharp peak at K = 6.Hence, six distinct groups were identified with a mean log likelihood [LnP (D)] value of -165 938, ranging from -212 480 (K = 1) to -168 367 (K = 6), with the highest DK value at K = 6 (Figure 1; Supplementary Figure 2; Supplementary Table 1).The LnP(D) values increased gradually from K=1 with some variation among K values, with K = 6 being the predicted number of major clusters in the barley panel.With the rise in K to the optimal K value, a steady improvement in the evaluated Ln P(D) was seen, indicating that six subpopulations contained all 191 genotypes with the greatest probability.These six sub-populations had the following numbers of genotypes: G1 (21), G2 (24), G3 (13), G4 (74), G5 (19), and G6 (40) (Supplementary Figure 2; Supplementary Table 1).Consistent with Pasam et al. (2012), these sub-populations of barley genotypes were clustered based on row types and geographic origin.A heatmap of the kinship matrix was computed using 21 062 SNPs to illustrate genetic relatedness among the barley genotypes in the panel, and a hierarchical tree was generated based on the kinship values (Supplementary Figure 3).Significant variation in correlation coefficients (from -0.16 to 1) between pairs of individual genotypes was observed in the kinship matrix.The genotypic groupings based on the population structure were also supported by the EMMA kinship algorithm.In the LD panel, the kinship values from EMMA matrix and eigenvalues from PCA were used to account for the relationships among the barley lines.The principal components (PC) were calculated from GBS-SNP markers data, with eigenvalues from 0.54 to 1.95.The principal component analysis (PCA) produced a set of three principal components, out of which PC1 and PC2 (with eigenvalues ≥1) accounted for a combined variance of 69.6% (Figure 2).Furthermore, the PC1 (xaxis) accounted for 45.7% of the explained variation, whereby the genotypes IG15RT-0004, IG15RT-0007, and IG15RT-0014 (InterGrain advanced breeding lines) exhibited a variation greater than 20%.Barley genotypes IG15RT-0001, IG15RT-0008, IG15RT-0011, IG15RT-0012, and IG15RT-0015, predominantly sourced from advanced breeding lines, accounted for a phenotypic variation exceeding 10%, with PC2 (y-axis) explaining 23.9% of the variation.
A phylogenetic analysis of 191 barley genotypes was carried out to evaluate whether the population structure could be inferred from the genome-wide genotypic data.Barley genotypes in the panel were clearly separated into different groups based on the Dice similarity coefficient values generated using the kinship matrix.A phylogenetic dendrogram was created using GBS-SNPs from the pairwise distances of 191 barley genotypes based on the root and shoot traits, which clustered all the genotypes into 10 phylogenetic groups (Supplementary Figure 3), with accession IG15RT_0147 being a single member of group 10.Compared with population structure grouping, the kinship matrix based on EMMA algorithm analysis (Supplementary Figure 3) clustered most genotypes in the groups 2 and 4 (containing 71 and 43 genotypes, respectively).Clustering groups 3 and 8 each comprised 10 barley genotypes, whereas groups 1 and 5 contained 16 and 20 genotypes, respectively.The remaining genotypes clustered into small groups 6 (3 genotypes), 7 (7), 9 (8), and 10 (1 genotype).Generally, all barley lines with the same pedigree were clustered together as expected, but with a few exceptions.The lines IG15RT_0176 and IG15RT_0161 that had the Flinders/Fathom pedigree were clustered into groups 2 and 3, respectively (Supplementary Figure 3).All lines with the Hindmarsh/5/ Hindmarsh/3/Hindmarsh/Scope//Hindmarsh/3/Hindmarsh pedigree were clustered into group 2, except line IG15RT_0011.The lines with the pedigree Vic9104/Dash/3/Skf/NsN//Onw/TR118/4/ Vlamingh were clustered into two separate groups (IG15RT_0053 in group 1 and IG15RT_0114 in group 3).The lines with the pedigree Vlamingh/Hindmarsh were clustered into groups 2 (IG15RT_0139) and 4 (IG15RT_0142).Out of barley lines with the WABAR2312/WABAR2334 pedigree, the only line clustered separately (in group 3) was 1GRT_0024.Similarly, among the lines with the pedigree WABAR2534/Lockyer, only IG15RT_0193 was separated into group 2.

Linkage disequilibrium and genomewide association analyses using root and shoot traits
The LD decay was estimated using 21 062 GBS-SNPs to calculate pairwise distances (bp) among LD sites.A principal component was included in the GWAS for population based on the Bayesian information criterion (BIC) and maximum log likelihood values, which were applied in the model selection option in GAPIT using BLINK model.We used an r 2 value rather than a normalized coefficient of LD (D′) because the r 2 value is a more reliable parameter for comparing two alleles in a population (Slatkin, 2008).Moreover, the r 2 values were negatively correlated with the physical distances (bp) between the loci.The intrachromosomal LDs were calculated using 754 275 significant pairs of SNP comparisons that had physical distances ranging from 0.24 kb to 767.30Mb across seven barley chromosomes (Figure 3).Significant LD was declared at p<0.001 and r 2 ≥0.1.All the 191 barley lines exhibited a low LD decay when evaluated at a physical distance of 2.97 Mb.The LD decay estimation revealed a mean r 2 value of 0.1 across the 191 barley lines, with a slightly higher r 2 on chromosomes 2H, 5H, and 7H (Supplementary Figure 4).In the barley genome, 14 596 pairs of SNPs had high LD scores (r 2 ≥0.8), most of which were on chromosome 3H.A strong LD was claimed at r 2 >0.5 (Zhou et al., 2012).Regarding the LD decay on each chromosome, 3H had the slowest LD (4.32 Mb), followed by 2H (3.74 Mb).By contrast, chromosome 1H, with an average LD decay of 1.71 Mb, had the fastest LD decay.The LDs on the remaining chromosomes 4H, 5H, 6H, and 7H were 2.94, 2.68, 2.64, and 3.21 Mb, respectively.
The GWAS study employed the BLINK model to identify chromosome regions that exhibited significant MTAs for a total of 26 root and shoot traits.The statistical description of each studied trait was presented in Supplementary Table 2.A total of 1199 non-unique significant markers (p ≤0.001) were identified across all 26 root and shoot traits, with r 2 values ranging from 0.08 to 0.41.The traits of topsoil root biomass (RB_top) and 20-100 cm deep root length (RL_lower) showed the strongest MTAs with r 2 values of 0.41 and 0.36, respectively.The strongest MTA was observed for diameter of roots in 0-20 cm depth [RD_top, -log10 (p-value) 17.88] at the physical position of 634.09Mb on chromosome 3H (Table 2).The status of significant MTAs was plotted against the threshold level of expected -log10 (p-value) as shown in the QQ-plots (Supplementary Figure 5).The highest number of significant MTAs (217 MTAs) was observed for diameter of roots in the 0-20 cm depth (RD_top), followed by root diameter (145 MTAs) in the 20-40 cm depth (RD_20), and length of thick roots (>0.25 mm diameter, i.e., diameter class length thick, DCL_thick) with 100 MTAs.By contrast, the lowest number of MTAs (15 MTAs) was observed in roots with diameter 0.075-0.25 mm (DCL_medium).In comparison, fewer significant MTAs were observed for shoot-related traits SH, Till, and SB (with 85, 56, and 26 MTAs, respectively).The genome-wide SNP markers that were most strongly (p ≤0.0001) associated with root and shoot traits are listed in Table 2.
The functional annotations for the sequences that carry most significant GBS-SNPs were identified by comparing physical locations (Mb) of the sequences that harbor SNPs with the annotated barley reference genome (Mascher et al., 2021).Proteins such as WRKY transcription factor protein family, glycogen synthase, auxin efflux carrier family protein, and thioredoxin superfamily protein were identified in these regions (Table 2), and they may play a role in barley root development (see also Abdel-Ghani et al., 2019).

Possible QTL regions for root and shoot traits
Multiple trait-associated markers in a short chromosome region may indicate that region as a possible QTL.All the root and shoot traits were analyzed to identify the putative QTL regions based on the physical chromosome positions of their trait-associated markers and the assumption that significant trait-associated SNPs within a 10-20 Mb physical interval in a chromosome (e.g.Jia et al., 2021) are linked to the same QTL (the genome-wide LD decay was 2.97 Mb).After determining physical locations of the 1199 significant traitassociated SNPs in the barley reference genome assembly (Barley Morex V3) (Supplementary Table 3), 271 highly significant MTAs were located in 40 chromosomal regions (QTLs), having five to 35 trait-associated SNPs in each QTL region.Out of these 40 QTL regions, 37 were associated with root traits, and only three were associated with shoot traits (Figure 4).The putative QTL regions associated with 19 traits (17 root and two shoot traits) were identified across six barley chromosomes (except 6H), having varying numbers of putative QTLs from two (1H) to 15 (7H).These QTLs had physical intervals from 0.53 to 10.2 Mb (Figure 4; Table 3).
The QTL regions were associated with 19 root and shoot traits, of which nine traits had only one QTL each; by contrast, root diameter of 0-20 cm deep (RD_top) and the longest root length (LRL) were each mapped to five QTL regions.Three putative QTLs were found for DCL_thick, DCL_thin, RB, RD_20, and RLR, whereas two QTLs were detected for RL_40, SH, and TRL (Table 3; Supplementary Table 3).Furthermore, nine QTLs were significant for at least two traits.The QTLs on 7H were significant for 13 traits, including root length, volume, diameter, biomass, and shoot height.These QTLs explained 6 to 15% of the phenotypic variation in different root and shoot traits (Table 3; Supplementary Table 3).
For a QTL region to be considered a hotspot due to the pleiotropic effects, at least two root and shoot traits were required to be linked to that particular region.We found nine hotspot QTLs in total, among which the 2H region from 714.78 to 727.98 Mb was important for five barley root traits (DCL_thick, TRL, RL_40, SRL, and RB) (Table 3; Supplementary Table 3).Out of the nine QTL regions that influenced multiple traits, the genomic region 640.72-651.91Mb on 7H contained 49 MTAs for five root traits (qDCL_thick_7_2, qRL_40_7_1, qRLR_7_2, qRL_top_7_1, and qTRL_7_1) within the same region and about 440 annotated candidate genes.The remaining genomic regions on 1H (498.2-506.76Mb), 3H (628.92-634.09and 681.14-688.60Mb),  were associated with two to three traits each; therefore, these regions may be important for root architecture of barley.

Discussion
The root is the first plant organ to sense edaphic stress conditions, and thus plays a key role in plant responses to stress stimuli.An efficient phenotyping system has been developed to examine intrinsic genetic variation in barley root architecture for selecting superior root traits in breeding programs (Chen et al., 2011a;Chen et al., 2011b).In the current study, 13 out of 23 root architectural traits showed a variable growth response in the semihydroponic system based on CV ≥0.25 with the lowest p-value (<0.001) (Wang et al., 2021).The 13 root parameters represented the five major root traits (biomass, surface area, volume, length, and diameter).Previous studies indicated root diameter, length, and dry weight strongly influenced the physiological activities related to barley nutrient uptake (Heydari et al., 2018); the increased root area, volume, biomass, length, and diameter traits increased water uptake QTLs detected on six barley chromosomes.The markers highlighted in red are unique among QTLs on the same loci.
TABLE 3 Significant quantitative trait loci (QTL) for 16 root traits and three shoot traits and their physical locations (Mb), number of significant traitassociated SNPs and proportion of phenotypic variation explained (PVE) by the QTL in the barley association panel.and improved stress tolerance (Manju et al., 2019).Furthermore, long roots can protect plants against stresses such as cold or drought (e.g.Hasanuzzaman et al., 2019;Oyiga et al., 2020); therefore, these five major root traits were considered the key root architectural traits in GWAS.SNP markers generated by high-throughput genotyping technologies have been used for the genetic dissection of root architectural traits in many crops, such as maize (Sanchez et al., 2018;Zheng et al., 2020), wheat (Golan et al., 2018;Mehrabi et al., 2020;Salarpour et al., 2020), sorghum (Parra-Londono et al., 2018;Zheng et al., 2020), and rice (Li et al., 2017).They were also used to identify genetic regions controlling development of root traits in different barley genotypes (Arifuzzaman et al., 2014;Reinert et al., 2016;Xue et al., 2019;Oyiga et al., 2020).In the present study, 21 062 GBS-SNP markers were used to examine the barley associationmapping panel and identify MTAs for the 23 root and three shoot parameters.Among them, the significant GBS-SNPs were identified for 17 root and two shoot parameters (Table 3; Supplementary Table 3); most of these parameters were related to root length and biomass as well as shoot height, which agrees with the report by Mora et al. (2016).

Chromosome
The root system of a crop plant is quite complex (anatomically, physiologically, and genetically).To complement the increasing need for the knowledge on these aspects of the root system, the current study explored the population structure, genetic relatedness using complex clustering methods, and association of genomic regions with root-related traits in 191 barley genotypes.Previously, Munoz-Amatriaın et al. ( 2014) identified five subpopulations in 2417 accessions in the barley core collection using SNP markers.Fan et al. (2016) used 408 DArT markers to detect six subpopulations in 206 barley accessions in a QTL study of salinity stress.Hamblin et al. (2010) found seven distinct populations among 1816 barley accessions from the USA using 1416 SNP markers.Similarly, in the present study, the structure analysis separated the barley association panel into six subgroups (K=6) (Figure 1; Supplementary Figure 2), which is the same number as in the previous studies (e.g.Fan et al., 2016).Multiple subgroups identified in this panel are not unexpected because the panel consists of barley accessions with diverse origins.The most accessions in the panel are semi-advanced and parental materials from the InterGrain Pty Ltd breeding program that breeds barley cultivars adapted to diverse environments throughout Australia.However, the parental accessions in the panel included also advanced breeding lines from the former Department of Agriculture and Food program in Queensland (Australia) and commercial varieties from the rest of Australia, as well as varieties and advanced breeding lines from Europe and Ethiopia.
In the current study, the methods such as population structure, kinship and clustering analyses were employed to get insight into the stratification of the barley collection.We corrected for population structure to minimize residual inflation of the test statistics due to unidentified population structure effects (cf.Cockram et al., 2008).Based on the phylogenetic relationships determined by the kinship matrix, the barley panel used in the present study formed two main clades with 10 different subgroups (Supplementary Figure 3).All the genotypes were separated into multi-line clusters based on their origins and breeding history, with the exception of IG15RT_0147 (an Ethiopian landrace), which is consistent with prior research (Igartua et al., 2013).Hence, the dendrogram clades were a good representation of genetic relatedness among the barley lines based on the 17 selected root traits.In comparison, Dai et al. (2012) grouped a set of 185 Tibetan barley accessions into two main clades and eight subclasses using 1307 DArTs, whereas Close et al. (2009) identified six distinct UPGMA groups from 37 barley accessions using 1301 SNPs.The differences in UPGMA clustering may be due to the different sources of plant populations and the marker systems used in different studies.
Previous studies have demonstrated the influence of a statistical model used on the significance of MTAs (Comadran et al., 2011;Pasam et al., 2012).By updating the statistical techniques, false positives may be reduced.The mixed linear model (MLM) includes population structure and kinship that may inflate p values in GWAS, and may also eliminate signals from the known genes that are present as background noise (Yu et al., 2006).Hence, a new statistical method (BLINK) was developed, combining fixed effect model (FEM), Bayesian information criterion, and linkage disequilibrium information to solve the problem of testing multiple loci in MLM.We used BLINK in the present study.The BLINK model accounts for heterogeneity of genetic background and decreases false positive associations (Yu et al., 2006).In our study, the FDR cut-off value used was 0.05, which is stringent and expected to get false positive results in only 5% of analyses, making it more reliable than the cut-off value of 0.1.
LD decay information is a key factor for determining the minimum density of markers required in association mapping and marker-assisted selection.Our LD analysis identified the most responsive genomic regions harboring significant MTAs for root traits (i.e.RA, RB, RD, TRL, and RV), with the most significant MTAs for root diameter (RD) and total root length (TRL).Our LD analysis identified 2.97 Mb as the average LD decay length, which is slightly shorter than 3.5 Mb using 350 barley accessions reported by Mwando et al. (2020).In comparison, Karunarathne et al. (2020) observed slower LD decay (10.6 Mb) than in the present study by employing 282 barley accessions.Furthermore, Comadran et al. (2011) examined the whole-genome pattern with a fast decay of LD in 190 elite barley germplasm (North and West European and North American) by using 4596 SNP markers, out of which 2132 had minor allele frequencies higher than 0.1 in association mapping, and 91.2% of these 2132 SNPs were mapped within 10 cM of their original genetic map positions, confirming the power of GWAS.In another study, Reinert et al. (2016) employed 5892 SNP markers with the proportion of the phenotypic variation explained (marker r 2 ) decreasing from 0.17 to<0.1 in chromosome 7H and being<0.1 for chromosomes 1H to 6H.In comparison, we report here the average r 2 of 0.1 across all strongly associated LD sites (Supplementary Figure 4), indicating that GBS-SNPs used in the current study were sufficiently robust for identifying significant MTAs (cf.Zhou, 2011;Naz et al., 2014;Mahalingam et al., 2020).Cai et al. (2013) found that the r 2 value ≥0.1 in LD decay may provide a significant output and have a large effect on barley root traits in GWAS.In the current study, we found the high LD extending along each chromosomal position because the mixed model significantly reduced the long range and background LD, which might happen due to the intra-and inter-chromosomal SNP-SNP interactions (Close et al., 2009).Furthermore, the marker pairs with strong LD patterns over long distances observed in the current study may be useful for understanding the genotypic selection, recombination and their breeding history (cf.Munoz-Amatriaın et al., 2014), reflecting the non-static nature of LD that is influenced not only by physical or genetic distances, but also by various other factors such as genetic admixture (McVean, 2006;Comadran et al., 2011).
In the current study, the most significant MTAs were found for root length, surface area, biomass, diameter, and volume (Figure 5), which was in agreement with the published reports (Comadran et al., 2011;George et al., 2014).Wu et al. (2015) found the strongest MTA value for barley root length on chromosome 2H, which was only slightly lower than our value for total root length (TRL, Figure 5).In another study, Reinert et al. (2016) found strong MTAs for root length on chromosome 5H (genetic position of 93.20 cM) using the same threshold level of p-value (0.001) as we used in the present study.The physical to genetic distance on barley chromosomes was 1-10 Mb per cM in distal regions, but it could be 100-500 Mb size in the pericentromeric region (Ariyadasa et al., 2014).In our study, some of the identified QTL regions were mapped in a larger marker interval; it was due to low coverage of SNP markers in those specific regions.
We identifi ed 40 QTL regions within 271 MTAs (Supplementary Table 3) for different root and shoot traits with seven genomic regions influencing multiple root or shoot traits in the barley association panel.Most of these MTAs were co-localized at the putative QTL regions on chromosomes 2H, 5H, and 7H.Similarly, Robinson et al. (2016) identified QTL on chromosome 5H for root number in barley, which was similar to the genomic region we detected for root length traits.Arifuzzaman et al. (2014) detected three novel QTLs (QRl.S42.2H,QRl.S42.3H, and QRl.S42.5H) for root length in barley on 2H (41.1 cM), 3H (118.72 cM), and 5H (125.1 cM) linkage groups using microsatellite (SSR) and DArT markers; these QTL regions were also significant for the root length and the related traits found in the present study (Supplementary Table 3).Reinert et al. (2016) detected 17 putative QTLs related to five key root and shoot traits of barley.Jia et al. (2019) identified 65 MTAs for root architectural architecture traits; thus, further fine mapping of the QTL will narrow down the QTL region and shorten the candidate list to determine the final causal gene(s) for the QTL.It is tempting to speculate that candidate genes identified in that particular region of 7H (i.e.HORVU7Hr1G025180 and HORVU7Hr1G009640) with functional annotation of glyceraldehyde-3-phosphate dehydrogenase C2 and auxin response factor 19, respectively (Abdel-Ghani et al., 2019), may be influencing barley root architecture.Another important QTL with 62 MTAs associated with five different parameters (DCL_thick, TRL, RL_40, SRL and RB) was located in the 13.2 Mb interval (714.78-727.98Mb) on chromosome 2H; this QTL may be important for root architecture and can be used in breeding for root system improvement.

Conclusions
We identified 1199 significant MTAs for 17 root and two shoot traits in barley germplasm.The 37 significant QTLs for these traits were detected in the association-mapping panel, with three QTLs for the shoot and 34 for the root traits.Root diameter and length had overlapped with the QTLs for root surface area and volume, and thus appeared to be the most promising root traits for markerassisted selection and breeding deep-rooting barley varieties for environmental adaptation (especially in drying soils).The putative QTLs for the important root traits and their associated markers may be useful genomic resources for marker-assisted selection to introgress these root trait QTLs into new cultivars to improve their adaptation to specific environments.Further validation of the significant markers should rely on larger populations to increase the power of genetic association studies.

FIGURE 2
FIGURE 2 Estimated principal components (PCs) explaining the percentage of variation in 191 barley genotypes by utilizing GBS-SNP data.

FIGURE 3
FIGURE 3Distribution of SNP markers (produced after quality filtering) based on the physical map of the barley genome.The marker density is shown in the color legend on the right.

TABLE 1
Root phenotypic parameters and their codes.

TABLE 2
Selected single nucleotide polymorphism (SNP) markers showing the strongest significant marker-trait associations with various root and shoot traits in the barley association panel.

TABLE 2 Continued
*The trait codes are listed in Table1.FIGURE 1An estimate of the most probable number of clusters based on DK in five iteration runs using GBS-SNP markers in the barley association panel.

TABLE 3 Continued
. The trait abbreviations are listed in Table1.The numbers in paratheses are total number of QTL for the trait identified in this study, if more than one.b.The numbers outside parentheses (if any) are the total numbers of trait-associated significant SNPs including these within 100 bp.The numbers in parenthesis are the numbers of the traitassociated significant SNPs that are at least 100 bp apart.c.The same numbers represent the QTL for different traits overlapping in the same genomic region. a