Identification and Confirmation of Loci Associated With Canopy Wilting in Soybean Using Genome-Wide Association Mapping

Drought causes significant soybean [Glycine max (L.) Merr.] yield losses each year in rain-fed production systems of many regions. Genetic improvement of soybean for drought tolerance is a cost-effective approach to stabilize yield under rain-fed management. The objectives of this study were to confirm previously reported soybean loci and to identify novel loci associated with canopy wilting (CW) using a panel of 200 diverse maturity group (MG) IV accessions. These 200 accessions along with six checks were planted at six site-years using an augmented incomplete block design with three replications under irrigated and rain-fed treatments. Association mapping, using 34,680 single nucleotide polymorphisms (SNPs), identified 188 significant SNPs associated with CW that likely tagged 152 loci. This includes 87 SNPs coincident with previous studies that likely tagged 68 loci and 101 novel SNPs that likely tagged 84 loci. We also determined the ability of genomic estimated breeding values (GEBVs) from previous research studies to predict CW in different genotypes and environments. A positive relationship (P ≤ 0.05;0.37 ≤ r ≤ 0.5) was found between observed CW and GEBVs. In the vicinity of 188 significant SNPs, 183 candidate genes were identified for both coincident SNPs and novel SNPs. Among these 183 candidate genes, 57 SNPs were present within genes coding for proteins with biological functions involved in plant stress responses. These genes may be directly or indirectly associated with transpiration or water conservation. The confirmed genomic regions may be an important resource for pyramiding favorable alleles and, as candidates for genomic selection, enhancing soybean drought tolerance.


INTRODUCTION
Among the various abiotic stresses to which soybean [Glycine max (L.) Merr.] is exposed, drought causes the most severe yield losses and greatest year to year variation for rain-fed production systems throughout soybean-growing regions (Oya et al., 2004). Between 1986 and 2020, the soybean production area in the United States impacted by drought ranged between 3 and 59% (https://www.ncdc.noaa.gov/monitoring-content/ societal-impacts/cmsi/562.tot.out), and there were 11 years in which the proportion of the soybean production area impacted by drought exceeded 20%. Total estimated economic losses due to drought during this same time period (adjusted to the consumer price index) were $217 billion in the United States (https://www.ncdc.noaa.gov/billions/events /US/1980/US/ -2020. It is likely that climate change will exacerbate the unpredictability of rainfall and will lead to an increased frequency of drought and flooding in the future (Douglas et al., 2008). Genetic improvement of soybean for drought tolerance is a cost-effective approach to stabilize yield under rain-fed production.
Past efforts to improve soybean drought tolerance through breeding have not taken full advantage of the potential genetic diversity available in germplasm collection (Frankel, 1984;Upadhyaya and Ortiz, 2001) nor have they taken direct advantage of the current understanding of physiological traits associated with drought tolerance (Sinclair et al., 2004;Sinclair and Purcell, 2005). Often, soybean breeders have focused on elite germplasm and restricted crosses to only include high-yielding elite lines, essentially "reshuffling" the same genes (Carter et al., 2004). As a result, less agronomically favorable genotypes with potential tolerance to drought have not been included, and potential progress has been inherently limited because of a lack of genetic diversity. Breeding efforts that target specific physiological traits that have agronomic advantages at the field level offer an alternative approach that draws upon previously under-utilized, diverse genetic resources (Sinclair et al., 2004;Tuberosa and Salvi, 2006).
Slow canopy wilting (CW) in soybean is a promising trait for crop improvement. Carter et al. (1999Carter et al. ( , 2006 screened exotic germplasm for drought tolerance in North Carolina and identified multiple slow-wilting genotypes, namely, PI 416937 and PI 471938. "USDA-N8002" is a soybean cultivar derived from PI 471938 (25% pedigree) and PI 416937 (12.5% pedigree), which is slow wilting and had yields averaging 7% greater than the cultivar check across 74 environments in the southern United States . More recently, several new genotypes that wilt more slowly than previously discovered genotypes have been identified (Kaler et al., 2017a;Steketee et al., 2020).
Slow wilting is associated with the conservation of soil moisture when soil moisture is plentiful, which can then be used when soil moisture in fast-wilting genotypes has been depleted Ries et al., 2012). The conservation of soil water for slow wilting in several genotypes appears to be associated with decreased hydraulic conductance under high vapor pressure deficit, resulting in decreased transpiration and improved wateruse efficiency relative to fast-wilting genotypes (Fletcher et al., 2007;Sinclair et al., 2008;Sadok and Sinclair, 2009;Devi and Sinclair, 2013).
Canopy wilting is a complex quantitative trait controlled by many genetic loci (Charlson et al., 2009;Du et al., 2009;Abdel Haleem et al., 2012;Hwang et al., 2015Hwang et al., , 2016Kaler et al., 2017a;Steketee et al., 2020). Hwang et al. (2015) used the results from five biparental mapping populations to identify clusters of eight quantitative trait loci (QTLs) for CW that were present in at least two populations, and a meta-analysis of these eight clusters identified nine meta-QTLs in eight chromosomal regions (Hwang et al., 2016). Association mapping of soybean CW identified 61 SNPs in a panel of 373 maturity group (MG) IV accessions (Kaler et al., 2017a) and 45 SNPs in a panel of 162 MG VI-VIII accessions (Steketee et al., 2020). Between the results of these two association-mapping studies for CW, similar genetic loci regions were identified on Gm01, Gm04, Gm06, Gm09, Gm12, Gm15, Gm18, Gm19, and Gm20. These two association mapping studies identified loci on Gm02 that were coincident with a meta-QTL identified previously (Hwang et al., 2016).
The objectives of this study were to confirm the slow-wilting loci identified previously by association mapping (Kaler et al., 2017a;Steketee et al., 2020) and linkage mapping (Charlson et al., 2009;Abdel Haleem et al., 2012;Hwang et al., 2016) and to identify additional novel loci associated with CW using a new panel of 200 diverse soybean accessions. We also considered the association of slow-wilting loci with loci associated with other drought-tolerant traits, such as carbon isotope (C13) ratio (as a measure of water use efficiency) (Kaler et al., 2017b;Bazzer et al., 2020a,b), oxygen isotope (O18) ratio (as a measure of transpiration) (Kaler et al., 2017b), and canopy temperature (Kaler et al., 2018;Bazzer and Purcell, 2020). An additional objective was to determine the ability of genomic estimated breeding values (GEBVs) from previous research studies to identify new slow-wilting genotypes from the United States Department of Agriculture (USDA) germplasm collection.

Field Experiments
In 2018, 200 MG IV accessions from the United States Department of Agriculture-Germplasm Resources Information Network (USDA-GRIN) germplasm collection (https://npgsweb. ars-grin.gov/) were selected for phenotypic evaluation of CW. Of the 200 accessions in this new panel, 100 represented the most genetically diverse genotypes (based on molecular marker data) from the original 373-accession panel used by Kaler et al. (2017a). Additionally, 100 new diverse accessions were selected from the USDA-GRIN collection based on extreme breeding values (BVs) calculated from previous association mapping results (Dhanapal et al., 2015a;Kaler et al., 2017aKaler et al., ,b, 2018. Allelic effects from these previous studies were then used to calculate BVs for each of the MG IV accessions in the USDA germplasm collection. We selected from the germplasm collection 10 accessions with the highest and 10 accessions with the lowest BVs for CW (Kaler et al., 2017a), canopy temperature (Kaler et al., 2018), C13 ratio (Kaler et al., 2017b), and a fraction of nitrogen derived from N 2 fixation (Dhanapal et al., 2015a). Additional 10 accessions were selected that had either high or low BVs for the combination of all four traits.
Soil test analyses were conducted, and P and K were applied as recommended at all site-years. In the PT and RH locations, 9 row plots were sown with a drill having 19 cm between rows and with a plot length of 4.57 m. The plots consisted of four rows at CO with rows 3.96 m in length and with 0.15 m between rows. At MC, there were three-row plots with rows 4.87 m in length and with 0.19 m row spacing. Herbicides and insecticides were applied as recommended to control weeds and insects at all siteyears. For the IR treatment, drip irrigation was used at CO, flood irrigation was used at PT, and furrow irrigation was used at RH and MC. At PT and RH, irrigation was applied to IR and DR treatments before the V6 stage when the estimated soil moisture deficit exceeded 50 mm (Purcell et al., 2007). After V6, no further irrigation was applied to the DR treatment.

Phenotypic Evaluations and Statistical Analysis
Canopy wilting was rated based on a visual scoring scale where 0 represented no wilting, 20 represented slight wilting and leaf rolling at the top of the canopy, 40 represented severe leaf rolling at the top of the canopy and moderate leaf wilting throughout the canopy and loss of petiole turgidity, 60 represented severe wilting throughout the canopy and loss of petiole turgidity, 80 represented severe petiole wilting and dead leaves scattered throughout the canopy, and 100 represented plant death Kaler et al., 2017a). After removing the RH19 and CO19 data, there were six site-years and two treatments. We subsequently refer to the combination of site-year and treatment as an environment. These 12 environments were designated for 2018 as follows: IR (PT18IR) and DR (PT18DR) at Pine Tree, AR; IR (RH18IR) and DR (RH18DR) at Rohwer, AR; IR (CO18IR) and DR (CO18DR) at Columbia, MO; and IR (MC18IR) and DR (MC18DR) at Maricopa, AZ. For 2019, there was a similar naming convention for IR (PT19IR) and DR (PT19DR) at Pine Tree, AR; and IR (MC19IR) and DR (MC19DR) at Maricopa, AZ.
Canopy wilting was rated four times at CO18 and MC19, two times at MC18, and one time at RH18, PT18, and PT19.
For all the environments, measurements were performed within 2 h of solar noon under a clear sky. For all rating dates, plant development ranged from late vegetative stages to R4 (Fehr and Caviness, 1977). Between emergence and the last rating date for the DR treatment, irrigation was applied three times at RH18, one time at PT18, zero time at PT19, four times at CO18, and three times at MC18 and MC19 before rating wilting. At RH18, PT18, PT19, and CO18, there was minimal stress on rating dates. The soil had been replenished with rainfall 3 days prior to rating wilting at RH18, 6 days at PT18, 6 days at PT19, and from 2 to 7 days for the four CO18 wilting ratings. At MC18, there was no rainfall during the measurement period, and CW scores were recorded 1 day after the irrigation in the IR treatment. After the initial rating at MC18, the DR treatment was irrigated, and CW scores were recorded after 14 and 21 days. At MC19, the DR treatment was irrigated and wilting was rated after 18, 21, 27, and 31 days, while the IR treatment was rated 1 day after irrigation (Supplementary Table 1). Cumulative potential evapotranspiration rate was calculated to quantify soil moisture deficit for the DR treatment for each site-year (Purcell et al., 2007). At the time of rating for all the environments, there was visual evidence of wilting among some genotypes in the IR treatment, and therefore, both the IR and DR treatments were scored.
Canopy wilting was measured multiple times at CO18, MC18, and MC19 (Supplementary Table 1), and the average values from individual rating dates were used for genomewide association (GWAS) analysis. With the exception of MC19IR, individual ratings within a site-year and treatment were significantly correlated (P ≤ 0.001) with correlation coefficients ranging between 0.43 and 0.77 (data not shown). Individual rating values also agreed closely (0.72 ≤ r ≤ 0.91) with the average rating for a given site-year-treatment combination. Previous reports of CW have also found similar ranking among genotypes and high correlations between individual rating dates Steketee et al., 2020), and previous mapping studies of CW have used both individual rating dates or average values from multiple rating dates (Charlson et al., 2009;Abdel Haleem et al., 2012;Hwang et al., 2015;Kaler et al., 2017a). Descriptive statistics and Pearson's correlation coefficients were computed using the PROC UNIVARIATE and PROC CORR procedures (α = 0.05) of SAS version 9.4 (SAS, Institute 2013), respectively.
For ANOVA, the PROC MIXED procedure (α = 0.05) of SAS 9.4 was used with a model: Y ijklm = µ + G i + S j + T k + GS ij + GT ik + ST jk + GST ijk + R l(j) + B m(l) + GR il(j) + GB im(l) + TR kl(j) + TB km(l) + GTR ikl(j) + (residual error εijklm ]. In this model, fixed effects were G i = effect of the ith genotype, S j = effect of the jth site year, and T k = effect of the kth treatment, plus all of the fixed effect two-way interactions (GS ij , GT ik , and ST jk ), and the three-way interaction GST ijk . The random effects included the following: R l(j) = effect of the lth replicate nested in site-year j, B m(l) = effect of the mth incomplete block nested in rep l, GR il(j) = effect of the interaction of the ith genotype with the lth replicate, GB im(l) = effect of the interaction of the ith genotype with the mth incomplete block, TR kl(j) = effect of the interaction of the kth treatment with the lth replicate, TB km(l) = effect of the interaction of the kth treatment with the mth incomplete block, GTR ikl(j) = effect of the interaction of the ith genotype, kth treatment, and lth replicate, and the residual error consists of the interaction of the ith genotype, the kth treatment, and the mth incomplete block.
PROC VARCOMP of SAS 9.4 with the restricted maximum likelihood (REML) method was used to estimate the variance components for the calculation of broad-sense heritability on an entry-mean basis. The best linear unbiased prediction (BLUP) values for each independent environment, as well as across environments, were estimated using META-R, and these values were then used for association analysis. Association analysis was conducted in three ways: (1) for each of the 12 environments, (2) averaged over site-years for IR (Ave_IR) and DR treatments (Ave_DR), and (3)

Genotyping and Linkage Disequilibrium
Marker data, consisting of 42,450 SNPs for all 200 accessions, were obtained from Soybase (Glyma.w82.a1,www.soybase.org) (Song et al., 2013(Song et al., , 2015. The Glyma.w82a1genome assembly was used, because these same markers were used in the previous association mapping of CW (Kaler et al., 2017a), and one objective was to confirm the previously identified markers. Marker data of 34,680 SNPs were filtered for quality control, which included removing monomorphic markers, heterozygous markers, markers with minor allele frequency (MAF) ≤ 5%, and markers with a missing rate higher than 10%. The remaining missing markers (those with ≤10%) were imputed using an LD-kNNi method, which is based on a k-nearest neighbor genotype (Money et al., 2015). These markers were used to measure pairwise linkage disequilibrium (LD) separately for euchromatic and heterochromatic regions based on squared correlation coefficients (r 2 ) of alleles in the TASSEL 5.0 software (Hill and Weir, 1988;Bradbury et al., 2007). The results indicated that at r 2 = 0.25, an average LD across all chromosomes decayed at an average of 175 kb in the euchromatic region and at an average of 5,100 kb in the heterochromatic region. These results on LD in soybean are consistent with previous studies (Schmutz et al., 2010;Hwang et al., 2014;Dhanapal et al., 2015b;Kaler et al., 2017a).

Genome-Wide Association Analysis
The FarmCPU model was chosen as the most appropriate model to control false positives and false negatives (Liu et al., 2016;Kaler et al., 2017aKaler et al., , 2020b. A significant threshold value (-Log10 P ≥ 3.5), which is equivalent to P ≤ 0.0003, was used to identify SNPs. This threshold P-value was chosen based on a formula that uses marker-based heritability (Kaler and Purcell, 2019) and is similar to threshold values used previously (Kaler et al., 2017aSteketee et al., 2020;Kaler and Purcell, 2021). To identify the common significant SNP present in more than one environment, a threshold value of P ≤ 0.05 was used but only if the representative SNP had an association of P ≤ 0.0003 in a second environment (Kaler et al., 2017a(Kaler et al., ,b, 2018. The allelic effect of a significant SNP was calculated by taking the difference in mean CW between genotypes with the major allele and those with minor allele. Alleles from either the major or minor class were considered as favorable if they were associated with reduced CW. A negative sign in the allelic effect indicated that the major allele was favorable for CW, and a positive sign in the allelic effect indicated that the minor allele was favorable for CW. To find coincident SNPs or overlap of loci identified in this research study with those loci reported in previous studies (Charlson et al., 2009;Abdel Haleem et al., 2012;Hwang et al., 2016;Kaler et al., 2017aKaler et al., ,b, 2018Bazzer and Purcell, 2020;Bazzer et al., 2020a,b;Steketee et al., 2020), we used Bedtools Intersect Intervals Tool (Quinlan and Hall, 2010) in Galaxy with an overlapping window of ± 175 kb. This window size was chosen because the average LD across all chromosomes decayed at an average of 175 kb in the euchromatic region. SNPs that were not coincident with previous studies were considered as novel loci.

Genomic Estimated Breeding Values (GEBVs) and Prediction Accuracy
We evaluated the accuracy of predicting CW by correlating (PROC CORR, SAS v. 9.4, SAS Institute, 2013) observed wilting scores using three different datasets with GEBVs (Meuwissen et al., 2001) from the BayesB genomic prediction model (Pérez et al., 2010). In the first scenario, the averaged CW scores of 373 (Kaler et al., 2017a) and 153 (Steketee et al., 2020) accessions were used as the training set for the genomic prediction of the 100 new accessions used in this study. In the second scenario, the averaged CW scores of the 100 new accessions used in this study and the 153 accessions reported by Steketee et al. (2020) were used as the training set for the genomic prediction of the 373 accessions reported by Kaler et al. (2017a). In the third scenario, the averaged CW scores of the 100 new accessions used in this study plus the 373 accessions reported by Kaler et al. (2017a) were used as the training set for the genomic prediction of the 153 accessions reported by Steketee et al. (2020).
Marker data consisted of 34,652 filtered SNPs for all accessions that were obtained from Soybase (Glyma.w82.a1,www.soybase.org) (Song et al., 2013(Song et al., , 2015. For the 162 accessions reported by Steketee et al. (2020), genotyping data from Soybase were available for only 153 accessions. Imputation and filtration were accomplished using TASSEL as described in the genotyping and LD sections.

Predicting Canopy Wilting for Soybean Germplasm Using GEBVs
The 19,648 accessions in the USDA soybean collection (https:// www.ars-grin.gov/), consisting of MGs from MG000 to MGX, were used as a testing population. The 373 accessions from Kaler et al. (2017a), 153 accessions from Steketee et al. (2020), and 100 new lines from this study were used as a training population to predict the CW from the soybean germplasm using the BayesB genomic prediction model (Pérez et al., 2010). The 10 slowest and 10 fastest wilting genotypes from each MG were identified based on GEBVs from the testing population. Genotyping data for all germplasm accessions were obtained Frontiers in Plant Science | www.frontiersin.org from Soybase (Glyma.w82.a1, www.soybase.org) (Song et al., 2013(Song et al., , 2015.

Candidate Gene Identification
Significant SNPs were used to identify candidate genes for CW using the G. max genome assembly version Glyma.Wm82.a1.v1.1 (www.soybase.org) (Schmutz et al., 2010). Genes located near SNPs associated with CW were considered as potential candidates if they were within ± 10 kb or ± 100 kb of a significant SNP in euchromatic and in heterochromatic regions, respectively. These distances were chosen to reflect the average distance between SNPs in these regions. Candidate genes were grouped into three gene ontology (GO) categories, namely, biological process, cellular component, and molecular function. Further, based on biological functions, genes were identified and categorized if they had any association with drought tolerancerelated responses, such as abscisic acid, water transport, root development, leaf senescence, jasmonic acid, heat acclimation, stomata, and salicylic acid (Schulze, 1986;Jackson et al., 2000;Schmutz et al., 2010;Jarzyniak and Jasinski, 2014;Khan et al., 2015;Sah et al., 2016).

Phenotype Descriptions
There were large differences among site-years between emergence and the last rating date in average maximum and minimum temperatures and total precipitation (Supplementary Table 1). Average maximum temperature was highest for MC18 (40 • C) and lowest for CO18 (30 • C), whereas average minimum temperature was highest for RH18 (23 • C) and lowest for CO18 (18 • C). Total precipitation between emergence and the last rating date was highest for PT19 (376 mm) and lowest for MC18 (5 mm). To quantify drought for each environment, we estimated the cumulative potential evapotranspiration rate (Purcell et al., 2007), which was highest in MC18 (318 mm) and MC19 (385 mm) (Supplementary Table 1).
There was a broad range of CW observed within a single environment, when averaged across IR or DR treatments, and when averaged across all the 12 environments ( Table 1). Within the IR treatment, CW scores had ranges of 35 (CO18IR), 17.5 (MC18IR), 3.3 (MC19IR), 40 (PT18IR), 35 (PT19IR), 30 (RH18IR), and 40 (Ave_IR) ( Table 1). Averaged over DR treatments, the range of wilting values was greater for the IR treatments by 22.5. For PT18, the average wilting score for the IR treatment was numerically greater (20.4) than that of the DR treatment (18.8), but the median values were the same (20), indicating that wilting scores between treatments were essentially the same. Although soil moisture was plentiful at RH, PT, and CO for both IR and DR treatments on measurement dates in 2018 and 2019, the IR treatment received irrigation earlier in the season but the DR treatment did not, which may have resulted in differential responses. Figure 1 shows the frequency distribution of the average genotypic means of CW for the IR and DR treatments, indicating that the DR treatment had a wider range of CW compared with the IR treatment. On one extreme, there were 11 genotypes for the average IR treatments and 18 genotypes for the average DR treatments that had wilting scores significantly (P ≤ 0.05) lower than those of the two slow-wilting checks, namely, PI416937 and PI471938. At the other extreme, there were three genotypes for the IR treatment and four genotypes for the DR treatment with wilting scores significantly (P ≤ 0.05) higher than those of the two fast-wilting checks, namely, A5959 and 08705_16 (Supplementary Table 3).
ANOVA of CW data indicated that there were significant effects for the fixed effect terms of genotype, site-year, treatment (IR and DR), and all two-way and three-way interactions

Effect
Degree of Freedom F-statistic P-value Only fixed effects are shown.
Out of 87 significant coincident SNPs that we confirmed from previous studies (Table 3), 38 SNPs that likely tagged 25 loci were from the IR treatment in individual environments, 31 SNPs that likely tagged 26 loci were from the DR treatment in individual environments, six SNPs that likely tagged five loci were based on the IR treatment averaged over site-years, eight SNPs that likely tagged eight loci were based on the DR treatment averaged over site-years, and four SNPs that likely tagged four loci were from values averaged across all the environments. Of the 38 significant SNPs from the IR treatment, 31 were present in at least two environments. Of the 31 significant SNPs from the DR treatment, 23 were present in at least two environments.
Four significant SNPs (ss715606242, ss715611329, ss715612746, and ss715632103) on chromosomes Gm10, Gm11, Gm12, and Gm18, respectively, were common between IR and DR treatments, averaged IR and DR treatments, and averaged across all environments ( Table 3). Two genomic regions had the exact same markers and positions that were identified by Kaler et al. (2017a). These two SNPs were found on Gm08 (ss715599784) and Gm18 (ss715632103) and had large allelic effects between −2.4 and −5.4 (highlighted area in Table 3). The allelic effect ranged from −5.1 to 2.3 for the 38 SNPs identified for the IR treatment among environments, −9 to 4.6 for the 31 SNPs identified for the DR treatment among environments, −4 to 1.1 for the six SNPs found when averaged over irrigated site-years, −4.7 to 1.7 for the eight SNPs found when averaged over DR site-years, and −5 to 1.1 for the four SNPs found when averaged across all environments.
The SNPs that were not coincident with previous studies were considered novel loci (

Genomic Estimated Breeding Values (GEBVs) and Prediction Accuracy
Averaged CW scores of the 373 accessions from Kaler et al. (2017a) combined with the 153 accessions from Steketee et al. (2020) were used as a training set for genomic prediction of the 100 new accessions used in this study. For three of the six IR site-years, there was a significant positive correlation (P ≤ 0.05) between GEBVs and observed CW that ranged from r = 0.26 (PT18IR) to r = 0.49 (PT19IR) ( Table 5). For the DR site-years, four of the six DR site-years had positive significant correlations between GEBVs and observed wilting that ranged from r = 0.2 (RH18DR) to r = 0.5 (MC18DR). GEBVs averaged across the IR (r = 0.45) and DR treatments (r = 0.43) and averaged across all environments (r= 0.45) also showed significant positive correlations. In a second scenario, averaged CW scores of the 100 new accessions from this study combined with the 153 accessions from Steketee et al. (2020) were used as a training set for genomic prediction of the 373 accessions reported by Kaler et al. (2017a). There were significant positive correlations (P ≤ 0.05) between observed CW and GEBVs for individual environments that ranged from r = 0.2 (Pine Tree 16) to r = 0.39 (Salina 16), and when averaged across environments (r = 0.37) ( Table 6).
In a third scenario, averaged CW scores of the 100 new accessions from this study combined with the 373 accessions reported by Kaler et al. (2017a) were used as a training set for genomic prediction of the 153 accessions reported by Steketee et al. (2020). There were significant positive correlations (P ≤ 0.05) between observed CW and GEBVs for individual environments that ranged from r = 0.35 (Salina 15) to r = 0.46 (Salina 16), and when averaged across environments (r = 0.5) ( Table 7).   FIGURE 2 | Location of SNPs significantly associated with CW in 12 environments, averaged over site-years for IR treatment (Ave_IR), averaged over site-years for drought treatment (Ave_DR), and averaged across all environments (AAE). Locations of SNPs associated with CW from the current research study were compared with SNPs previously identified with CW, canopy temperature (CT), C13, and O18 ratios. Details about the coincident SNPs are described in Table 3.

Predicting Canopy Wilting for Soybean Germplasm Using GEBVs
Averaged CW scores of the 100 new accessions from this study combined with the 373 accessions reported by Kaler et al. (2017a) and the 153 accessions reported by Steketee et al. (2020) were used as a training set for the genomic prediction of CW for the 19,648 soybean accessions reported by Song et al. (2015). A wide range of predicted CW scores from <15 to more than 31 was observed among the accessions (Supplementary Figure 4). For each MG, the 10 genotypes with the lowest predicted scores  Kaler et al. (2017a) and Steketee et al. (2020), respectively. *Significant at P = <0.05.  Kaler et al. (2017a) was determined from the averaged CW scores of the 100 new accessions of the current study and of the 153 accessions reported by Steketee et al. (2020). *Significant at P = <0.05.  Steketee et al. (2020) was determined from the averaged CW scores of the 100 new accessions of the current study and of the 373 accessions reported by Kaler et al. (2017a). *Significant at the P = <0.05. and the 10 genotypes with the highest predicted scores are presented in Supplementary Table 5. GEBVs for the slowest wilting genotypes among MGs ranged from 9 to 14, and GEBVs for the fastest wilting genotypes among MGs ranged from 16 to 33. The MG with the greatest range in GEBVs for wilting was MG VI (9 to 33), and the MG with the least range in GEBVs for wilting was MG X (9 to 22) (data not shown).

Candidate Gene Identification
Of the 87 coincident SNPs and the 101 novel SNPs associated with CW in this study, 87 genes from the coincident SNPs and 96 genes from the novel SNPs were identified within ±10 kb in the euchromatic region and ±100 kb in the heterochromatic region using the G. max genome assembly version Glyma.Wm82.a1.v1.1 in SoyBase (www.soybase.org) (Schmutz et al., 2010). The annotations of the biological processes, molecular functions, and cellular components of these genes are reported in Supplementary Table 6 for coincident SNPs and Supplementary Table 7 for novel SNPs. Based on biological functions, several genes were associated with droughtrelated responses, such as abscisic acid, water, root, leaf senescence, jasmonic acid, heat acclimation, stomata, and salicylic acid (Schulze, 1986;Jackson et al., 2000;Schmutz et al., 2010;Jarzyniak and Jasinski, 2014;Khan et al., 2015;Sah et al., 2016).

DISCUSSION
This study was conducted to confirm loci previously reported and identify novel loci associated with CW by association mapping. There was wide phenotypic variation in CW, which is important for dissecting complex traits through association mapping (McCarthy et al., 2008). In comparison with slow-wilting checks, on one extreme, PI407927B had significantly lower (P < 0.05) CW scores under both IR and DR treatments. At the other extreme, PI507407 and PI507408 had wilting scores significantly (P < 0.05) greater than those of fast-wilting checks under both the IR and DR treatments (Supplementary Table 3). We also predicted slower and faster wilting accessions from the USDA Soybean Germplasm Collection (Supplementary Table 5) using GEBVs. These slow-wilting genotypes represent new genetic resources for providing breeders with favorable slowwilting alleles. This study showed significant (P < 0.001) positive correlations (r = 0.8) for CW between the IR and DR treatments, and moderate to high heritability (39% ≤ H 2 ≤ 84%), indicating that CW was relatively stable across the environments. Similar results of correlations and heritability were reported in previous mapping studies for CW (Charlson et al., 2009;Abdel Haleem et al., 2012;Hwang et al., 2015;Kaler et al., 2017a;Steketee et al., 2020).
It is counter-intuitive that wilting was rated under water-replete conditions in the IR treatment. Except for MC19, however, there were highly significant (P ≤ 0.001) correlations between the IR and DR treatments within a site-year and between the Ave_IR and Ave_DR ratings (r = 0.8, Supplementary Table 3). Of the 75 SNPs identified in individual IR environments (Tables 3, 4), 42 of these same SNPs were also found in individual DR environments and 33 were unique to the IR environments. The discovery of wilting QTLs specific for the IR environments may reflect genomic regions that are responsive to the early stages of drought.
Out of the 87 coincident SNPs found in this study, 42 likely tagged 31 loci previously associated with only CW (Charlson et al., 2009;Abdel Haleem et al., 2012;Hwang et al., 2016;Kaler et al., 2017a;Steketee et al., 2020) and 45 likely tagged 37 loci previously identified with other drought-related traits (canopy temperature, and C13 and O18 ratios) ( Table 3 and Figure 2). The genomic regions that were consistent across MGs (MGIV from this study and Kaler et al., 2017aKaler et al., ,b, 2018and MGVI-VIII from Steketee et al., 2020) and across biparental populations, and different environments show particular promise as selection targets for improving CW under stress. In particular, SNP_ID ss715632103 on Gm18 (59162269 bp) and SNP_ID ss715599784 on Gm08 (16250528 bp) were identical to SNPs previously associated with CW (Kaler et al., 2017a). The genomic regions found in common between this and previous mapping studies may be an important resource in genomic selection studies to improve drought tolerance in soybean. Apart from coincident SNPs, this study also identified 101 novel SNPs that tagged 84 loci associated with CW that could be additional resources for the improvement of the CW in soybean.
Genomic selection was originally proposed by Meuwissen et al. (2001), and simulations have demonstrated that it is far more effective and efficient than marker-assisted selection for polygenic traits (Bernardo and Yu, 2007;Jannink et al., 2010). Both simulation and empirical studies have repeatedly shown that genomic selection performs as well as, and frequently better than, phenotypic selection (Wong and Bernardo, 2008;Matei et al., 2018;Voss-Fels et al., 2019). The breeding community has concluded that genomic selection has the potential to decrease overall costs and potentially allow more cycles of selection per unit time, as compared with phenotypic selection (Wong and Bernardo, 2008;Matei et al., 2018;Voss-Fels et al., 2019).
We determined the ability of GEBVs using different scenarios of training and testing populations from this and previous studies (Kaler et al., 2017a;Steketee et al., 2020) to predict CW phenotypes of unknown genotypes. In general, there was significant positive prediction accuracy between observed CW and GEBVs. Although the accuracy of the predictions was somewhat low (average irrigated 0.45 and average drought 0.43; Table 5), the heritability for the traits was relatively high when the Maricopa location was excluded (ranging from 0.62 to 0.86, Table 1), and correlations between locations were relatively high (excluding Maricopa; Supplementary Table 3). Based on these results, we anticipate genomic selection will permit more rapid progress toward the release of soybean cultivars with improved tolerance to water limitation and/or higher water-use efficiency than marker-assisted selection, phenotypic selection, or the most common strategy: evaluating breeding populations only in highyielding, often irrigated, environments.
Out of 188 significant SNPs, 183 candidate genes were identified (87 from coincident SNPs; Supplementary Table 6 and 96 from novel SNPs; Supplementary Table 7) in this study within ±10 kb or ±100kb in euchromatic and heterochromatic regions, respectively, of associated SNPs that had biological functions associated with stress responses or water transport. Among 183 candidate genes identified, 57 SNPs were present within genes that code for proteins having biological functions involved with plant stress responses. These genes may be directly or indirectly associated with transpiration or water conservation. Supplementary Tables 6, 7 provide information on these candidate genes and their associated functions in water transportation, abscisic acid stimulus, and root development (Schmutz et al., 2010).

CONCLUSIONS
This study confirmed 31 slow-wilting loci identified previously by association mapping (Kaler et al., 2017a;Steketee et al., 2020) and linkage mapping (Charlson et al., 2009;Abdel Haleem et al., 2012;Hwang et al., 2016). Similarly, we found 37 CW loci that overlapped with loci for other drought-related traits (C13 ratio, O18 ratio, and canopy temperature). This study also identified 84 novel loci associated with CW using a panel of 200 diverse MG IV soybean accessions. There were 183 candidate genes within ±10 kb (euchromatic region) or ± 100 kb (heterochromatic region) of CW SNPs that were associated with stress responses. GEBVs from this study and previous research studies were used to identify genotypes from all the 13 MGs in the USDA Soybean Germplasm Collection that were extremes for slow or fast wilting. Favorable alleles from confirmed genomic regions and the identification of additional slow-wilting genotypes may be important new resources for improving drought tolerance in soybean.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.

AUTHOR CONTRIBUTIONS
SC and AK wrote initial drafts of the manuscript. SC, AK, and JG collected genomic information and analyzed data. LP coordinated and supervised the project. All authors were involved in planning, executing, collecting data from field experiments, and read and approved the final manuscript. Supplementary Figure 4 | Frequency distribution for predicted canopy wilting scores for 19,648 soybean accessions in the USDA Soybean Germplasm Collection for maturity groups (MGs) 000 through X. Canopy wilting scores were predicted using genomic estimated breeding values that used training sets from the current research, Kaler et al. (2017a), and Steketee et al. (2020).
Supplementary Table 1 | Planting dates, wilting rating dates, weather data, number of irrigations, and potential evapotranspiration rate for the drought treatment at six site years. a Cumulative potential evapotranspiration between emergence and last rating date.
Supplementary Table 3 | Accessions identified as wilting slower or faster than slow-wilting (PI416937 and PI471938) and fast-wilting (A5959 and 08705_16) checks when averaged over irrigated (Ave_IR) and drought (Ave_DR) treatments. a Standard deviation values from analysis of variance for Ave_IR and Ave_DR were 6.8 and 8.9, respectively. averaged drought (Ave_DR) treatments, and averaged across all environments (AAE).

Supplementary
Supplementary Table 5 | Predicted slowest and fastest wilting accessions of maturity group (MG) 000 through X using genomic estimated breeding values (GEBVs) from the USDA Soybean Germplasm Collection (Song et al., 2015). The training set for the prediction of 19,648 accessions was determined using the averaged canopy wilting (CW) scores of 100 new accessions in the present study, 373 accessions reported by Kaler et al. (2017a), and of the 153 accessions reported by the Steketee et al. (2020).
Supplementary Table 6 | List of significant coincident SNPs associated with canopy wilting and their potential candidate genes based on 87 identified SNPs from twelve environments for the irrigated and drought treatments, averaged over irrigated treatments (Ave_IR), average over drought treatments (Ave_DR), and averaged across all environments (AAE).
Supplementary Table 7 | Significant novel SNPs associated with canopy wilting and their potential candidate genes based on 101 identified SNPs from twelve environments under irrigated and drought treatments, averaged across irrigated treatments (Ave_IR), averaged across drought treatments (Ave_DR), and averaged across all environments (AAE).