Investigation of the Influence of Leaf Thickness on Canopy Reflectance and Physiological Traits in Upland and Pima Cotton Populations

Many systems for field-based, high-throughput phenotyping (FB-HTP) quantify and characterize the reflected radiation from the crop canopy to derive phenotypes, as well as infer plant function and health status. However, given the technology's nascent status, it remains unknown how biophysical and physiological properties of the plant canopy impact downstream interpretation and application of canopy reflectance data. In that light, we assessed relationships between leaf thickness and several canopy-associated traits, including normalized difference vegetation index (NDVI), which was collected via active reflectance sensors carried on a mobile FB-HTP system, carbon isotope discrimination (CID), and chlorophyll content. To investigate the relationships among traits, two distinct cotton populations, an upland (Gossypium hirsutum L.) recombinant inbred line (RIL) population of 95 lines and a Pima (G. barbadense L.) population composed of 25 diverse cultivars, were evaluated under contrasting irrigation regimes, water-limited (WL) and well-watered (WW) conditions, across 3 years. We detected four quantitative trait loci (QTL) and significant variation in both populations for leaf thickness among genotypes as well as high estimates of broad-sense heritability (on average, above 0.7 for both populations), indicating a strong genetic basis for leaf thickness. Strong phenotypic correlations (maximum r = −0.73) were observed between leaf thickness and NDVI in the Pima population, but not the RIL population. Additionally, estimated genotypic correlations within the RIL population for leaf thickness with CID, chlorophyll content, and nitrogen discrimination (r^gij = −0.32, 0.48, and 0.40, respectively) were all significant under WW but not WL conditions. Economically important fiber quality traits did not exhibit significant phenotypic or genotypic correlations with canopy traits. Overall, our results support considering variation in leaf thickness as a potential contributing factor to variation in NDVI or other canopy traits measured via proximal sensing, and as a trait that impacts fundamental physiological responses of plants.

Many systems for field-based, high-throughput phenotyping (FB-HTP) quantify and characterize the reflected radiation from the crop canopy to derive phenotypes, as well as infer plant function and health status. However, given the technology's nascent status, it remains unknown how biophysical and physiological properties of the plant canopy impact downstream interpretation and application of canopy reflectance data. In that light, we assessed relationships between leaf thickness and several canopy-associated traits, including normalized difference vegetation index (NDVI), which was collected via active reflectance sensors carried on a mobile FB-HTP system, carbon isotope discrimination (CID), and chlorophyll content. To investigate the relationships among traits, two distinct cotton populations, an upland (Gossypium hirsutum L.) recombinant inbred line (RIL) population of 95 lines and a Pima (G. barbadense L.) population composed of 25 diverse cultivars, were evaluated under contrasting irrigation regimes, water-limited (WL) and well-watered (WW) conditions, across 3 years. We detected four quantitative trait loci (QTL) and significant variation in both populations for leaf thickness among genotypes as well as high estimates of broad-sense heritability (on average, above 0.7 for both populations), indicating a strong genetic basis for leaf thickness. Strong phenotypic correlations (maximum r = −0.73) were observed between leaf thickness and NDVI in the Pima population, but not the RIL population. Additionally, estimated genotypic correlations within the RIL population for leaf thickness with CID, chlorophyll content, and nitrogen discrimination (r gij = −0.32, 0.48, and 0.40, respectively) were all significant under WW but not WL conditions. Economically important fiber quality traits did not exhibit significant phenotypic or genotypic correlations with canopy traits. Overall, our results support considering variation in leaf thickness as a potential contributing factor to variation in NDVI or other canopy traits measured via proximal sensing, and as a trait that impacts fundamental physiological responses of plants.
Keywords: abiotic stress, leaf thickness, canopy reflectance, cotton, high-throughput phenotyping, specific leaf weight INTRODUCTION Field-based high-throughput phenotyping (HTP) offers the potential of rapidly and accurately characterizing phenotypic variation in large populations grown under conditions that are relevant to commercial crop production (reviewed in White et al., 2012;Pauli et al., 2016b). Most methods proposed for HTP under field conditions employ measurements of reflected radiation or thermal emissions from the crop canopy. For such measurements, the uppermost leaves in the canopy are usually the dominant visible component, unless reproductive organs have emerged above the foliage, with which light interacts. In characterizing crop traits via proximal sensing methods using instruments mounted on high-clearance tractors or unmanned aerial vehicles, it is important to understand how variation in leaf traits affect canopy reflectance. One such trait that is of particular importance is the physical thickness of a leaf.
Leaf thickness largely determines the length of the optical path of light through a leaf and the number of anatomical features (e.g., cell walls and chloroplasts) that either reflect, absorb, or transmit light. The trait also has important relationships with biomass partitioning, net productivity and crop response to water deficits. A fundamental tradeoff exists between partitioning strategies that favor thinner leaves with a greater leaf surface area per unit leaf mass, as opposed to thicker leaves and less leaf area (Poorter and Remkes, 1990). While greater surface area has the potential to increase light interception, thicker leaves typically have greater photosynthetic rates (Pettigrew et al., 1993). Water deficits are often associated with leaf thickness and otherwise affect traits associated with leaf thickness such as leaf water content, osmotic potential, and transpiration, which may relate to compensation for reduced expansion of leaf surfaces (area).
Leaf area index (LAI, the total leaf area per unit area of land) can be expressed as the product of leaf mass per unit land area (L) and the specific leaf area (SLA), where the SLA is the ratio of leaf area to leaf mass (fresh or dry). To provide a more direct association with leaf thickness, the inverse of SLA, the specific leaf weight (SLW) is used, and we subsequently emphasize SLW. Although the relation of physical thickness to SLW is somewhat complicated by variation in water content and in the volume of gas-filled space in the mesophyll, leaf thickness usually varies proportionally with SLW. Also, SLW often is proportional to concentrations of chlorophyll and total leaf nitrogen when concentration is expressed on a leaf area basis (White and Montes, 2005). Leidi et al. (1999) detected large variation in SLW of cotton and also found that SLW decreased with transpiration efficiency, measured as carbon isotope discrimination (CID) and seed cotton yield. Given the evidence of relationships between SLW and CID and the value of CID as an integrative measure of transpiration efficiency (Farquhar et al., 1989), variation in CID relative to leaf thickness may provide insight into resource capture and partition. Additionally, nitrogen isotope discrimination (referred to as D15N hereafter) may potentially reveal how short-term variation in nitrogen cycling, nitrogen metabolism, and responses to water deficit impacts canopy reflectance traits like normalized difference vegetative index (NDVI), a general measure of crop health and biomass (Tucker, 1979;Craine et al., 2015).
The thickness of a leaf is initially established following a phase of rapid thickening growth (Maksymowych, 1973). In addition to water deficits, low temperature and high irradiance are associated with thicker leaves (Van Volkenburgh and Davies, 1977;Rawson et al., 1987;Nobel, 1999;Evans and Poorter, 2001). Although elevated atmospheric CO 2 is usually expected to increase SLW due to accumulation of assimilate (Poorter and Perez-Soba, 2002), Thomas and Harvey (1983) reported that thicker leaves in soybean (Glycine max L. Merr.) under elevated CO 2 resulted from the formation of an additional layer of palisade mesophyll.
In cotton, leaf thickness increases with main stem node position but plateaus by node 12 or 13 (Gausman et al., 1971). At the species level, variation has been observed in the diploid, A-genome donors of G. arboreum L. and G. herbaceum L. with older leaves forming an additional layer of palisade mesophyll cells on the abaxial (lower) side (Morey et al., 1974;Bhatt and Andal, 1979;Leidi et al., 1999). With respect to cultivated cotton, Morey et al. (1974) reported differences in leaf thickness among 17 lines representing the perennial races of G. hirsutum L. as well as two upland cultivars under greenhouse conditions measured in 2 and 6 month old plants.
A concern related to selection of leaf traits that might affect canopy reflectance properties is that of developmental correlations; traits affecting cell sizes within leaves may also impact the cells sizes of other tissues (White and Gonzalez, 1990;John et al., 2013). Thus, selection for traits related to leaf spectral reflectance might have undesirable effects on other useful plant traits. In perennial ryegrass (Lolium perenne L.), divergent selection for mesophyll cell size resulted in heavier seed and greater shoot dry matter for small-cell size selections (Wilson and Cooper, 1970). In cotton, a particular concern is fiber quality. Because cotton fibers are formed from unicellular epidermal hairs (Mauney, 1984), selection affecting leaf thickness also might affect epidermal hairs. Although associations among fiber quality traits and agronomic factors have been examined (Ulloa, 2006;Dabbert et al., 2017) research on how genetic variation in cell size might affect fiber quality appears to be lacking.
Recent research using proximal sensing in cotton demonstrated that spectral reflectance indices measured on crop canopies can identify genetic differences among cotton lines under well-watered and water-limited deficit conditions (Andrade-Sanchez et al., 2014;Pauli et al., 2016a). However, there exists knowledge gaps in understanding how the physical and biochemical properties of the cotton canopy itself impact canopy reflectance detected using HTP approaches to characterize genetically diverse germplasm under contrasting irrigation regimes across multiple years. The main objectives of the research described herein were to determine (1) whether genetic variation in leaf thickness or related traits affected canopy spectral reflectance measured using HTP methods, (2) whether relations existed between leaf thickness and other crop traits either through physiological or developmental correlations, and (3) identify regions of the cotton genome controlling variation in leaf thickness.

MATERIALS AND METHODS
All measurements were made on two populations of cotton. The upland (Gossypium hirsutum L.) set was the TM-1×NM24016 mapping population (Percy et al., 2006;Gore et al., 2012) of 95 recombinant inbred lines (RILs). Of the parents used to create this population, TM-1 is the current G. hirsutum genetic standard, whose genome was recently sequenced , and represents the upland ideotype in terms of relative vigor, high fertility, uniformity, and fruiting habit (Kohel et al., 1970). NM24016, in contrast, is an inbred line derived from an interspecific cross between G. hirsutum and G. barbadense with approximately 37% genomic similarity, based on DNA marker analysis, to G. barbadense. Morphologically, its traits display an intermediate phenotype between the two species (Cantrell and Davis, 2000). The second population was a diversity panel comprised of 25 Pima (Gossypium barbadense L.) lines released from 1918 to 2009, capturing a wide range of phenotypic diversity from Arizona with two additional lines originating from the Caribbean Islands. The two populations were grown in three sets of field trials from 2010 to 2012 at Maricopa, AZ (lat. 33.070 • N,long. 111.974 • W, elev. 360 m) on a Casa Grande sandy loam (fine-loamy, mixed, superactive, hyperthermic Typic Natrargids). Experimental designs, crop management and phenotyping were described previously (Carmo-Silva et al., 2012;Andrade-Sanchez et al., 2014;Thorp et al., 2015;Pauli et al., 2016a). Briefly, well-watered (WW) and water-limited (WL) irrigation trials of the upland lines were arranged as 11 × 10 (0.1) α-lattices with two replicates. Pima lines were arranged as 5 × 5 (0.1) α-lattices with four replicates. To reduce border effects, a commercial upland or Pima cultivar was planted on the sides of each replicate. One-row plots were 8.8 m long and 1 m wide with a 0.61 m alley at row ends. Plant density for both populations was ∼4.1 plants m −2 after thinning.
Crop management followed recommended practices for the desert southwest. Crops were furrow irrigated for germination and seedling establishment, and subsequently irrigated via subsurface drip. Irrigations for the well-watered (WW) regime were scheduled to refill the depleted soil water of the cotton root zone based on calculated crop evapotranspiration using the dual crop coefficient procedures of the Food and Agriculture Organization Paper 56 (Allen et al., 1998). Allowable depletion of the total available root zone soil water was set at 35% active rooting zone, with a few final adjustments to the soil water balance made based on actual soil moisture as measured via neutron probe readings. Weekly soil moisture content readings were made from 0.1 to 1.5 m, in 0.2-m increments. When 50% of plots had reached first flower, the water-limited (WL) irrigation regime was imposed by providing 50% of the water applied to the WW regime.
Dates for crop management and measurements are summarized in Supplementary Table 1, and key dates are indicated for each year on Figure 1, which also shows temperature and precipitation for each year. Samples for leaf thickness and SLW were acquired before 10:30 AM Mountain Standard Time (MST) to avoid possible changes in thickness FIGURE 1 | Daily weather during the 3 years of cotton experiments. Letters along the dashed line at the top of the graph for each year indicate the time from planting (PL) to chemical defoliation (DF), the date that the water-limited irrigation regime was initiated (WD) and the start and end dates for measurements of leaf thickness (T). The red and blue colored lines represent the maximum and minimum air temperature, respectively. Black dots denote the precipitation amounts and days on which it occurred. related to progressive water loss during the day. Leaf thickness (THK, reported as mm) was measured on five to eight fullyexpanded leaves per plot from the uppermost part of the canopy, sampling at the third or fourth interveinal region from the leaf apex. Measurements were made using a hand-held micrometer (Mitutoyo Digital Micrometer Model 293-185, Kawasaki, Japan) with a digital display and a clutch that ensured uniform pressure. Plot positions and micrometer readings were dictated in the field using Philips Voice Tracer 667/00 (Koninklijke Philips N.V., Amsterdam) digital recorders, and the resulting audio was converted to digital text via the speech recognition software Dragon Naturally Speaking (version 11 Premium; Nuance Communications, Inc., Burlington, MA, USA). We estimated a reference thickness as the mean of BLUEs for the WW regimes across the 3 years because our underlying hypothesis is that leaf thickness is a constitutive trait that affects other traits.
Relative chlorophyll content (SPAD, unitless) was obtained for five to eight leaves per plot, sampled as for thickness, using a Minolta SPAD Meter 502 (Konica Minolta Sensing, Inc., Japan). Additionally, actual chlorophyll a (Chl_a) and ab (Chl_ab) concentrations were measured using a protocol adopted from Porra et al. (1989). Harvested leaf disks, two samples per plant, were frozen to −80 • C until time of processing at which point 1 mL of 100% methanol was added to sample tubes and mixed well. Samples were then incubated at 4 • C for approximately 48 h and then mixed and spun down so that 200 µL of supernatant could be transferred to a microtiter plate and absorbance read at 652 and 665 nm using a Bio-Tek Microplate reader (Bio-Tek, Winooski, VT). Concentrations were reported as µg cm −2 .
In 2010 only, specific leaf weights were estimated for five, 1-cm diameter leaf disks cut with a leaf punch that deposited samples into a glass vial, again sampled from fully-expanded leaves in the upper canopy. The vials were refrigerated while transported to the laboratory for fresh weight determination. The weighed samples were then oven dried (70 • C) and re-weighed for calculation of specific leaf weight. Estimates of specific leaf weight were reported on fresh (SLW fr ) and dry (SLW dr ) bases in units of g m −2 .
To measure CID, leaf tissue samples were taken from six representative plants within each plot with samples taken from the upper lobe of a fully expanded leaf near the fourth node of the plant. Leaf discs were taken with a 6 mm punch and sampled directly into 1.2 mL tubes of a 96-well plate which were then promptly stored on ice in a Styrofoam cooler until brought out of the field; tissue samples were then properly preserved for subsequent analyses. Carbon isotope composition analysis was performed by the University of California, Davis Stable Isotope Facility (Davis, CA, US). In 2010, leaf discs were collected on day 231 (Julian calendar), which corresponded with the end of cotton boll development and fill. In 2011 and 2012, leaf discs were collected on days 251 and 249 (Julian calendar), respectively, which coincided with cotton fiber development and elongation. Dried leaf discs were ground to a fine powder followed by weighing and placing 1-2 mg of subsamples into foil capsules. Carbon isotope composition was determined with an isotope ratio mass spectrometer (Sercon Ltd., Cheshire, UK) and calculated as δ 13 C ( ) relative to the international Vienna Pee Dee Belemnite (V-PDB) reference standard (Farquhar et al. (1989). Carbon isotope discrimination (CID, reported as part per thousand, mole fraction, ) was then calculated by the method of Farquhar et al. (1989) using the following equation: where δ a and δ p represent the stable carbon isotope composition of the atmosphere and the plant tissue sample, respectively. On the V-PDB scale, a value of −8 was used for the free atmospheric CO 2 concentration, δ a . For nitrogen discrimination (D15N, ), values were calculated relative to atmospheric composition.
At the end of each growing season prior to mechanical harvesting, 25 bolls were sampled from each plot and processed using a laboratory 10-saw gin to collect fiber for analysis of quality. Fiber quality measurements for upland cotton were made using an Uster HVI 1000 (High Volume Instrument, Uster, Charlotte, NC) at Cotton Incorporated (Cary, NC). Fiber quality measurements for the Pima population were also made on an Uster HVI 1000 but conducted at the Fiber and Biopolymer Research Institute at Texas Tech University (Lubbock, TX). The traits measured were fiber elongation (ELO, percent), strength (STR, kN m kg −1 ), uniformity (percent), micronaire (unit), and length (upper half mean, UHM, mm). However, in the current work, fiber strength, elongation, and upper half mean are discussed, as these traits are more representative of the underlying biological process of carbon fixation.
A field-based, high-throughput phenotyping (HTP) system was used to rapidly collect proximally sensed canopy data to evaluate numerous canopy phenotypes over the 2010-12 growing seasons. The design, development, operational parameters, and field evaluation of this phenotyping platform have been previously described in detail in Andrade-Sanchez et al. (2014) and Pauli et al. (2016a). Briefly, a LeeAgra AvengerPro modified high-clearance small plot spray rig with a front, horizontal boom was used to move identical sets of sensors over four adjacent rows, with geographic positions measured with an RTK-GPS returning cm-level accuracy (∼2 cm resolution). Each set of sensors included ultrasonic proximity sensors to measure canopy height, infrared radiometers to measure canopy temperature, and active light multi-spectral crop canopy sensors to measure canopy reflectance. For the present study, only the data collected by the multi-spectral crop canopy sensor (Crop Circle ACS 470, Holland Scientific, Lincoln, NE, US) were used, which provided canopy reflectance (ρ) in three 10 nm wavebands with band centers at 670, 720, and 820 nm. The wavelength data collected from the CropCircle multi-spectral sensors were used to calculate normalized difference vegetation index (NDVI, unitless) as follows: where ρ NIR is the spectral reflectance at wavelength 820 nm in the near-infrared waveband region and ρ red is the spectral reflectance at wavelength 670 nm in the red waveband region. Measurements were taken in the early morning (0700), midmorning (1000 or 1100), afternoon (1300), and/or late afternoon (1500) with all times reported in MST. The time of day (0700, 1000, 1100, 1300, or 1500) that data were collected is referred as time of day (TOD), while the actual time, measured in minutes, that a measurement was taken is referred to as time of measurement (TOM). Only the data collected nearest to the time of leaf thickness measurements are reported; the HTP system required ∼0.75 h to traverse the complete set of experimental plots.

STATISTICAL ANALYSES
Best linear unbiased estimators (BLUEs) were estimated for each trait via iterative mixed linear model fitting using ASReml-R version 3.0 (Gilmour et al., 2009), as detailed in Pauli et al. (2016a). To assess whether the leaf thickness, physiological traits, fiber quality, and post-processed NDVI data contained outliers, we initially fitted a simplified mixed linear model for each trait using the MIXED procedure in SAS for Windows version 9.4 (SAS Institute, Cary, NC). For the physiological and fiber quality traits, the fitted model for an individual trait included the main effects of genotype and irrigation regime with their two-way interaction as fixed effects; year, year-by-genotype interaction, replication nested within irrigation regime, column nested within the two-way interaction of replication and irrigation regime, and block nested within the two-way interaction of replication and irrigation regime were included as random effects. The fitted model used for NDVI outlier removal included the main effects of genotype and irrigation regime with their two-way interaction as fixed effects; replication nested within irrigation regime and block nested within the two-way interaction of replication and irrigation regime were included as random effects. For both models, degrees of freedom were calculated via the Satterthwaite approximation. The Studentized deleted residuals (Neter et al., 1996) obtained from these mixed linear models were examined to detect outliers and remove them for subsequent analyses.
For the NDVI data sets, plot-level averages were calculated with the MEANS procedure in SAS for Windows version 9.4 (SAS Institute, Cary, NC). For each physiological and fiber quality trait, an iterative mixed linear model fitting procedure was conducted across years in ASReml-R version 3.0 (Gilmour et al., 2009): in which Y ijklmn is an individual phenotypic observation; µ is the grand mean; year i is the effect of the ith year; irg j is the effect of the jth irrigation regime (WW or WL); (irg × year) ij is the interaction effect between the ith year and jth irrigation regime; rep(irg × year) ijk is the effect of the kth replication within the jth irrigation regime within the ith year; column(rep × irg × year) ijkl is the effect of the lth plot grid column within the kth replication within the jth irrigation regime within the ith year; block(rep × irg × year) ijkm is the effect of the mth incomplete block within the kth replication within the jth irrigation regime within the ith year; genotype n is the effect of the nth genotype; (genotype × year) in is the interaction effect between the nth genotype and the ith year; (genotype × irg) jn is the interaction effect between the nth genotype and the jth irrigation regime; (genotype × irg × year) ijn is the effect of the three way interaction effect between nth genotype, the jth irrigation regime, and the ith year; and ε ijklmn is the random error term following a normal distribution with mean 0 and variance σ 2 . The model terms rep(irg × year) ijk, column(rep × irg × year) ijkl, and block(rep × irg × year) ijkm were fitted as random effects while all other terms were fitted as fixed effects. Likelihood ratio tests were conducted to remove all terms from the model that were not significant at α = 0.05 (Littell et al., 2006).
For NDVI an iterative repeated measures mixed linear model fitting procedure was conducted separately for each day in ASReml-R version 3.0 (Gilmour et al., 2009): in which Y ijklmo is an individual plot-level average; µ is the grand mean; tod i is the effect of the ith TOM within a day; irg j is the effect of the jth irrigation regime (WW or WL); (tod × trt) ij is the effect of the interaction between the ith TOM within a day and the jth irrigation regime; rep(irg × tod) ijk is the effect of the kth replication within the jth irrigation regime within the ith TOM within a day; column(rep × irg × tod) ijkl is the effect of the lth plot grid column within the kth replication within the jth irrigation regime within the ith TOM within a day; block(rep × irg × tod) ijkm is the effect of the mth incomplete block within the kth replication within the jth irrigation regime within the ith TOM within a day; tom(irg × tod) ijn is the effect of the nth minute the measurement was taken within the jth irrigation regime within the ith TOM within a day; genotype o is the effect of the oth genotype; (genotype × tod) io is the effect of the interaction between the oth genotype and the ith TOM within a day; (genotype × irg) jo is the effect of the interaction between the oth genotype and the jth irrigation regime; (genotype × irg × tod) ijo is the effect of the interaction between the oth genotype, the jth irrigation regime, and the ith TOM within a day; and ε ijklmno is the random error term following a normal distribution with mean 0 and variance σ 2 . The residual variance, ε ijklmno , was modeled using a correlated error variance structure that incorporated a constant, non-zero, correlation term (ρ) among error terms to account for correlation among multiple measures on the same experimental unit. The following terms were fitted as fixed effects in the model: All other terms were fitted as random effects. Likelihood ratio tests were conducted to remove all terms from the model that were not significant at α = 0.05 (Littell et al., 2006). For each trait, any remaining influential outliers from the final fitted model were detected on the basis of the DFFITS criterion (Neter et al., 1996;Belsley et al., 2004) in ASReml-R version 3.0 (Gilmour et al., 2009). Once influential observations were removed, the final model (2 or 3) for each trait was refitted to estimate a BLUE for each genotype across years (fiber quality and physiological traits) or within a day (NDVI) for the separate irrigation regimes. Sequential tests of fixed effects were conducted with degrees of freedom being calculated with the Kenward and Rogers approximation (Kenward and Roger, 1997) in ASReml-R version 3.0 (Gilmour et al., 2009).
For each trait, broad-sense heritability on an entry-mean basis (Ĥ 2 ) or repeatability (Piepho and Möhring, 2007) was estimated to provide a measure of how much phenotypic variation among genotypes was due to heritable genetic effects rather than to environmental or measurement error for the Pima population in the absence of pedigree or molecular marker data; in the context of the upland population (biparental family) this is only referred to as broad-sense heritability on an entry-mean basis (Ĥ 2 , referred to as heritability hereafter). Heritability was estimated for the separate irrigation regimes using a mixed linear model. To estimate heritability, models (2) and (3) were reformulated to remove the irrigation regime term. Next, all terms were then fitted as random effects in order to obtain variance component estimates. The variance component estimates from each final model for fiber quality and physiological traits were used to estimateĤ 2 (Holland et al., 2003) as follows: where σ 2 g is the estimated genetic variance, σ 2 gy is the estimated variance associated with genotype-by-year variation, σ 2 ε is the residual error variance, n year is the harmonic mean of the number of years in which each genotype was observed and n plot is the harmonic mean of the number of plots in which each genotype was observed. The denominator of Equation (5) is equivalent to the phenotypic variance, σ 2 p . The variance component estimates from the final model for NDVI were used to estimateĤ 2 (Holland et al., 2003) as follows: where all terms are as previously defined above.
Because the objectives of this study focused on understanding how genotypic differences in leaf thickness impact other phenotypes, we calculated a reference leaf thickness (reference thickness) that represented the expected phenotype under ideal conditions, i.e., no water deficit. To accomplish this, an overall BLUE was calculated for each genotype using the measurements from the WW regime. This was expected to mitigate the effects of water deficit on leaf thickness thereby minimizing confounding environmental factors that could adversely bias the estimate.
To investigate the genetic relationship among the traits, we estimated the genotypic correlations (r gij ) and their standard errors in the RIL population with respect to the two irrigation regimes. Due to the uncontrolled, multiple levels of relatedness between lines, this analysis was not possible to conduct with the Pima population. To carry out the analysis, we used a multivariate restricted maximum likelihood (REML) estimation procedure implemented in PROC MIXED of SAS version 9.4 (SAS Institute., Cary, NC) as described by Holland (2006). Prior to model fitting, the BLUEs calculated for the individual years within irrigation regime were standardized to have a mean of zero and a standard deviation of one; this was done using PROC STANDARDIZE in SAS to assist in model convergence. The model used for the RIL population to estimate variance components was as follows: where Y ijkl are the paired BLUEs for the ith and jth traits in the kth year for the lth genotype; µ is the multivariate grand mean; year(trait) ijk is the effect of the kth year on the combined ith and jth traits; genotype l is the effect of the lth genotype; (year × genotype) kl is the effect of the interaction between the kth year and the lth genotype; and ε ijkl is the random error term. The random effect terms in the model were genotype l and (year × genotype) kl while the only fixed effect was year(trait) ijk . To estimate the covariance associated with the paired ith and jth traits for the estimated BLUEs per each genotype, the REPEATED statement was used. The estimated variance components form Equation (7) were used in the following formula to derive the genotypic correlations (r gij ):r whereσ Gij is the estimated genotypic covariance between traits i and j,σ Gi is the estimated genotypic standard deviation of trait i andσ Gj is the estimated genotypic standard deviation of trait j.
To explore the effect of reference leaf thickness on specific traits, once effects of year and irrigation regime were accounted for, linear regression was performed using the GLM procedure of SAS with the model: where Y ijk is the BLUE for a given trait (as opposed to value for individual replicates), irg(year) ij is the effect of the jth year nested within the effect of the ith irrigation regime, thickness k is the reference thickness for the kth genotype, and ε ijk is the random error term following a normal distribution with mean 0 and variance σ 2 . Sums of squares are sequential (Type I) to indicate the effect of variation in leaf thickness once expected large effects of irrigation regime nested within year are considered.
Within an irrigation regime, the Pearson's correlation coefficients (r) were estimated using PROC CORR in SAS version 9.4 (SAS Institute Inc., Cary, NC) to examine relations between sets of BLUEs for different traits.
To identify the regions of the cotton tetraploid genome controlling phenotypic variation in leaf thickness, we performed quantitative trait loci (QTL) mapping within the upland RIL population. Due to lack of genotypic data and appropriate population construction/composition, QTL mapping within the Pima population was not possible. The genotyping and linkage map construction for the TM-1×NM24016 RIL population has been previously described in detail in Gore et al. (2014). Briefly, the linkage map consisted 841 molecular markers assigned to 117 linkage groups covering approximately 50% of the cotton genome; this generated a linkage map ∼2,061 cM in length.
The BLUEs for leaf thickness were used individually to map additive QTL effects with respect to the WL and WW irrigation regimes using inclusive composite interval mapping (ICIM) (Li et al., 2007(Li et al., , 2015 for biparental populations implemented in the software IciMapping v 4.0 (https:www.integratedbreeding. net). To determine the logarithm of odds (LOD) threshold value for declaring significance, a permutation procedure was run 1,000 times (Churchill and Doerge, 1994) within the IciMapping software to achieve an experiment-wise Type I error rate of α = 0.05.

RESULTS
The upland and Pima cotton lines showed large variation in leaf thickness (Table 1, Figure 2). Comparing the two sets of germplasm, the upland lines had thicker leaves (three year averages of 0.26 and 0.26 mm for the WL and WW regimes, respectively) than the Pima lines (0.23 and 0.22 mm for the WL and WW regimes, respectively). No mean effect of the irrigation regime on thickness was found for either population (P > 0.05, Table 2), but genotype-by-irrigation regime effects were detected for both populations (P < 0.01 for the upland and P < 0.0001 for the Pima). For both dry and fresh SLW, a trait that generally tracks well with leaf thickness, the irrigation regime effect was highly significant (P < 0.001) for the Pima population but nonsignificant (P > 0.05) for the upland population. The effect of the individual years on thickness was large for Pima (P < 0.001), whereas for the upland population, no year effect was detected (P > 0.05), but again, large genotype-by-year effects were found for both populations ( Table 2). The broad-sense heritability of leaf thickness was generally high (>0.60) across the years and irrigation regimes.
Other leaf physiological traits (chlorophyll a and ab, SPAD, CID, and D15N) displayed a marked contrast between the upland and Pima populations with respect to the effect of irrigation regime. For chlorophyll content (a and ab), CID, and SPAD readings, the effect of irrigation regime was nonsignificant for upland but highly significant (P < 0.0001; Table 2) for the Pima population. D15N did not vary with irrigation regime and showed no genotype-by-irrigation regime effect for either population. Of these physiological traits, SPAD, CID, and D15N all displayed highly significant (P < 0.0001, Table 2) genotypeby-year interaction effects for both populations.
The use of a novel HTP system enabled us to collect NDVI data under actual field conditions on both the upland and Pima populations at multiple times per day over the growing season. In comparing the two populations, the mean NDVI values were not significantly different (two-sided t-test, P > 0.05, Table 3), and both populations displayed higher values under WW conditions, as expected. Interestingly, in 2010 the Pima population had a larger range of NDVI values but in years 2011 and 2012, the upland population exhibited a much larger range of values; in 2012 alone the range of values was more than twice that of the Pima population. The high estimates of broad-sense heritability (0.80-0.99) demonstrate that NDVI measurements collected by the HTP system were repeatable.
The three cotton fiber quality traits investigated in this study varied in response to genotype and irrigation regime, with effects ranging from nonsignificant to highly significant (P < 0.0001), but year and genotype-by-year effects were all highly significant (P < 0.001, Table 2). The heritability values for these three traits were also high with the lowest reported value being 0.81 for fiber elongation in the WW irrigation regime in 2011 (Supplementary  Table 2). This finding is not surprising as fiber quality traits are generally highly heritable and exhibit low environmental variance (Pauli et al., 2016a;Dabbert et al., 2017).
In examining relations between reference leaf thickness and individual traits, patterns varied between the two sets of germplasm and in some instances, with year or irrigation regime ( Table 4). The two populations also varied for relationships between leaf thickness and NDVI. For NDVI of the Pima population (Figure 3; Table 4), there were highly significant, strong correlations (maximum of −0.73, P < 0.001) with leaf thickness but in the upland population, none of the correlations were significant. The correlations between the concentrations of chlorophyll a and ab with leaf thickness and reference thickness were generally positive in both populations; however, there were more than three times as many significant associations among reference thickness and chlorophyll content ( Table 4). The SPAD values also exhibited a positive relationship with leaf thickness, but fewer correlations were significant (Table 4; Figure 4). Specific leaf weight, measured only in 2010, showed varied relations with actual and reference thickness (Supplementary Table 3). Correlations were strongest for SLW fr under WL conditions, and only two of eight correlations were significant for SLW dr . As reported for common bean (White and Montes, 2005), associations between SLW and thickness were weaker than implied by studies that assert a direct equivalence between the two traits, thus emphasizing that SLW is an imperfect proxy for leaf thickness.
The genotypic correlations estimated for the RIL population provided insight into the potential genetic relationship among traits. Under the WW conditions, leaf thickness exhibited significant genotypic correlations with chlorophyll content, both a and ab, D15N, and CID (r gij values ranging from −0.32 to 0.49, P < 0.05 to 0.01, Table 5); these same pairwise trait correlations were not significant under the WL regime. The contrast between treatments is not unexpected given the significant genotype-byirrigation effect detected for leaf thickness ( Table 2). The effect of the irrigation regime on genetic correlations was also evident for two other trait-pairs, namely NDVI/D15N and NDVI/SPAD. For SPAD, the genotypic correlation was only significant under the WL regime whereas for NDVI with D15N, the correlation was only significant in the WW conditions but its value, −0.69, was three times that of the value for the WL conditions, −0.23.
Consistent with the expectation that thicker leaves are associated with increased water use efficiency, and hence lower CID, the overall trend was that CID decreased with increasing leaf thickness (Table 4; Figure 5). This negative relationship between CID and thickness was also observed in the genetic  correlations under WW conditions ( Table 5). For the upland population only the correlation in 2010 under WW conditions was significant (r = −0.22, P < 0.05) between reference leaf thickness and CID. However, for the Pima population four of the six possible correlations between reference leaf thickness and CID were significant (P < 0.05) with correlation values (r) ranging from −0.41 to −0.56; three of those significant correlations were observed under WW conditions. Otherwise, CID showed no consistent phenotypic trends with NDVI or SPAD values (Table 4). However, CID did display significant genetic correlations with NDVI under WW conditions as well as chlorophyll a and ab under both irrigation regimes. In assessing possible relationships between leaf thickness and fiber quality, neither the upland nor the Pima populations showed effects of either reference leaf thickness or singleseason/treatment thickness values (Supplementary Figure 1;  Supplementary Table 4). However, when assessing the relationship of fiber quality with NDVI and SPAD values, the two populations exhibited markedly different characteristics. The Pima fiber quality traits all had significant, negative correlations with NDVI, and with regard to SPAD, fiber length and strength had significant, negative correlations; the upland population exhibited correlations close to zero for these associations (Supplementary Table 4).
Given the effects of year and irrigation regime on crop traits ( Table 2), multiple linear regression was used to estimate whether variation in key traits was explained by the reference leaf thickness once mean effects of irrigation regime and year were considered ( Table 6). For NDVI in the upland RIL population, variation in reference leaf thickness explained only 1% of the residual sums of squares whereas for the Pima population, reference leaf thickness explained a significant (P = 0.01) amount, 5%, of the residual variance. For chlorophyll a, reference thickness had a much more significant effect (P < 0.001) on the trait; it explained 10 and 8.7% of the residual trait variance for the RIL and Pima populations, respectively. The trait that exhibited the largest difference between populations with respect to the portion of variance explained by leaf thickness was CID. Leaf thickness explained over 17% of the variation in CID in contrast to only accounting for ∼3% in the RIL population. Combined, these results further support the conclusion that leaf thickness contributes to the variation observed in leaf physiological traits.
Finally, the QTL analysis revealed four unique genomic locations, on chromosomes D02, D03, D08, and D09, responsible for the variation in leaf thickness ( Table 7). The detected QTL on D09 was identified under both irrigation regimes, and on average, explained 13.40% of the observed variation. Of the remaining identified QTL, which were all detected in the WL irrigation regime, the one located on D08 explained the largest portion of phenotypic variation at 18.58% and had an effect estimate of 0.006 mm.

DISCUSSION
Field based HTP allows for the rapid collection of valuable phenotypic data under real-world production conditions, such as heat and drought stress. Central to utilizing these data for crop improvement is understanding how basic morphometric properties of the plant canopy impact radiometric properties. This knowledge will be critical as the plant science community transitions into working with larger genetic populations such as the planned 5,000 line upland cotton nested association mapping (NAM) panel and the currently in-development G. barbadense diversity panel of ∼400 lines (White et al., 2012;Hinze et al., 2016). However, before these larger populations can be leveraged to their full extent, a foundational knowledge of leaf properties must be developed in order to account for the effects when largerscale phenotyping projects are initiated; these larger populations represent a much more complex genetic system. To address this knowledge gap, we undertook the present study using tractable experimental populations of 95 upland RILs and a modest sized collection of 25 Pima cultivars. These panels were selected because of their past characterization, and with respect to the RIL population, serve as a benchmark resource within the cotton genetics community Andrade-Sanchez et al., 2014;Fang et al., 2014;Thorp et al., 2015). We evaluated both populations under contrasting irrigation regimes to assess the effects of leaf thickness on spectral reflectance measured using HTP methods. The relationships between leaf thickness and other physiological and fiber quality traits were also assessed to identify potential shared biology resulting from simple variation in leaf thickness.
The upland (G. hirsutum) and Pima (G. barbadense) populations both exhibited variation for leaf thickness, and broad-sense heritabilities were generally high regardless   of irrigation regime (Table 1, Figure 2). This finding, in combination with the QTL identified in the upland RIL population, provides further evidence that leaf thickness is a trait with a strong genetic basis in cotton. With respect to the actual leaf thicknesses, the upland RILs consistently had thicker leaves than the Pima lines, on average 0.035 mm thicker. Although the main effect of irrigation regime was nonsignificant for the two populations studied, the interaction effects of genotype-by-irrigation regime and genotype-by-year were highly significant confirming that genotypes from both species responded differentially to growing conditions. This can be exemplified by the decline in thickness for the Pima population in 2012 relative to 2010 and 2011 (Figure 2). In 2012, due to a period of rainy weather (Figure 1), thickness measurements were delayed which may have permitted new leaves to emerge. If these new leaves were formed under lower irradiance conditions, they would be expected to be thinner (Patterson et al., 1977;Evans and Poorter, 2001), which  suggests that leaf thickness of Pima germplasm may be sensitive to prior weather or management on a time scale of a few weeks.
Several apparent differences between the upland and Pima populations highlight the diversity in genetic composition and the consequences that diversity can have on phenotypic  relationships. With respect to effect of the irrigation regime on all traits other than leaf thickness, a stark contrast is observed between the two populations; excluding leaf thickness, eight out of the 10 traits for the Pima population showed highly significant (P < 0.01) irrigation regime effects in contrast to the upland population where only two traits were significant for irrigation. This observation, in combination with the differences in correlation values for NDVI and leaf thickness, as well as the higher heritability estimates for the Pima population (one-sided t-test, P < 0.01), highlight the different genetic structures of the two germplasm assemblages. The upland population only captures the genetic variation present in just two parental genotypes whereas the Pima population is composed of genotypes representing 90 years of breeding and selection. Because of this difference in population composition, there is more genetic and allelic variation present in the Pima population that likely impacts the differences in phenotypic variation as well as response to water deficit (Falconer and Mackay, 1996). These genetic and phenotypic differences are further supported by the developmental history of American Pima lines which involved the intercrossing of germplasm from various geographical regions, including germplasm of Peruvian and Sea Island descent (Peebles, 1954;Feaster and Turcotte, 1962;Smith et al., 1999;Percy, 2009). However, there is an associated limitation in using a diverse panel of Pima lines that span a time continuum and capture more genotypic and phenotypic diversity than that of a biparental population. The statistically significant correlations observed between NDVI and fiber quality traits in the Pima population must be carefully interpreted as they are confounded by breeding history and overall plant improvement. The earliest released lines had low leaf/stem biomass yield but these characteristics progressively increased over time due to selection for plant productivity along with simultaneous genetic improvements to stress tolerance (or avoidance), yield, and fiber quality. Further compounding the issue of trait correlations is the relatedness among the lines themselves as superior genotypes (those lines that were released for commercial production) or close relatives were likely used as parents for the next cycle of breeding. Without molecular marker data or pedigree information, we were unable to account for this relatedness in our analyses, an area of potential improvement in our current work because line relatedness and year of release could impact other correlations as well. Correlations between NDVI and fiber quality traits were nonexistent in the upland population. Such a lack of association is likely due to having two mostly modern parental genotypes as population founders and a population mating design that reshuffled parental genomes by recent recombination during RIL development. Taken together, this essentially negated the issues of release date and population structure.
Despite these differences in genetic structure between the two populations, the observed contrasts in the physical properties of the plants themselves are still likely due to underlying physiological differences for abiotic stress tolerance between the two species (Dabbert and Gore, 2014). Upland cotton is generally considered better adapted to drought given its Mesoamerican origin compared with Pima which originated from northwest South America near bodies of water (Saranga et al., 2004;Wendel et al., 2010). Because of their divergent origins, both species may have evolved different methods for environmental adaptions to stress environments like those conditions found in our study (Saranga et al., 1998). This contrast in adaptive ability is further supported by Saranga et al. (2004) who found that there was contrasting loci with favorable allelic variation in either species for stress-adaptive traits. Evidence of this nature provides some insight into how these two species respond to environmental conditions and give rise to the observed differences between the species and populations used.
Correlations between leaf thickness and NDVI for the upland population were low in contrast to the Pima population, which had strong, negative correlations between the two traits. For the Pima lines, NDVI decreased with greater thickness (Table 4), which is consistent with the expectation that thicker leaves may be associated with reduced leaf area and hence NDVI. This result raises the question about the utility of using NDVI, or more generally spectral reflectance data, as a selection tool for leaf thickness. Previous laboratory-based analyses using passive hyperspectral sensors with individual leaves have detected strong correlations between leaf thickness and NIR reflectance (wavelengths ranging from 750 to 1,350 nm) in cotton  as well as diverse species (Knapp and Carter, 1998;Seelig et al., 2008). In comparison, our study utilized an active, multispectral radiometer with only one NIR band (820 nm) measuring canopy-level reflectance in the field. Our fieldbased, canopy-level results suggest that if there is an appreciable amount of phenotypic variation, such as in an association mapping panel or a diverse collection of elite cultivars, NDVI could potentially be a useful selection tool for leaf thickness. However, NDVI measurements within breeding families, like the RIL population used in this study, may not adequately discriminate leaf thickness amongst related lines given the low correlation values we observed. To extend this work, further research is needed to exclude alternate factors such as differences in canopy architecture or leaf anatomy, including possible gene pool differences in leaf thickness as found in common bean (Phaseolus vulagris L.) (Sexton et al., 1997), to better understand the dynamics of NDVI as related to leaf thickness. Overall, the trends with NDVI support our proposition that FB-HTP involving canopy reflectance measurements should consider phenotypic variation in leaf thickness as an underlying cause of variation in NDVI with potentially large effects on other physiological traits.
The correlations between leaf thickness and other leaf traits were consistent with the expectation that thicker leaves would have a greater chlorophyll concentrations and hence SPAD readings. Weak negative correlations with CID agreed with previous research where genotypes with thicker leaves had greater transpiration efficiency (Rao and Wright, 1994;Rebetzke et al., 2008). This assessment is further supported by the genetic correlation analyses carried out in the RIL population. The genetic correlations revealed a significant negative relationship between leaf thickness and CID and positive correlations with chlorophyll content (both chlorophyll a and ab) and D15N under WW conditions. This finding suggests a shared genetic basis between leaf thickness and these physiological traits, 6 | Analysis of variance (ANOVA) for multiple regressions that test for influence of reference leaf thickness on NDVI, chlorophyll a concentration, SPAD, and carbon isotope discrimination (CID) once effects of irrigation regime within years are considered. Thus, tests are for sequential (Type I) sums of squares (SS).  and furthermore, emphasizes the value in understanding how genetic variation in cotton leaf thickness affects fundamental physiological crop traits. In contrast, the lack of phenotypic and genotypic associations between leaf thickness and fiber quality parameters ( After accounting for the effect of irrigation, the use of a reference leaf thickness value (a derived trait representing the idealized phenotype not confounded by environmental effects) for linear regression provided a means to assess the impact of leaf thickness on other canopy component traits. Although percent variation explained by reference thickness was low, which may be due to the shortcoming of using a reference value based on only 3 years of data, the estimated portions of variance were still significant, especially for the traits chlorophyll a and CID. These results demonstrate how physical characteristics impact both the radiance and physiological properties of leaves. Given these findings in combination with the strong genetic basis of leaf thickness, supported by the relatively moderate to high heritability estimates and the detection of loci controlling phenotypic variability, it is clear that further investigation of this trait is warranted. Selection on leaf thickness itself, which should respond quite favorably, could be beneficial in producing more stress resilient cotton plants that are able to better maintain key fiber quality traits when faced with environmental challenges. The use of molecular markers in linkage with causal loci for leaf thickness, like those identified herein, could further aid in the selection of plants with desirable leaf characteristics. However, an unresolved issue is whether leaf thickness is best measured manually, as done here, or can be related to data from proximal or remote sensing either through direct associations with specific reflectance indices or via inversion of a radiative transfer model (Thorp et al., 2015).

CONCLUSION
Measuring the thickness of cotton leaves with a micrometer allowed for reliable non-destructive sampling that identified large genetic differences for both upland and Pima cotton populations. The Pima lines showed potential relations with NDVI that support a tradeoff between thicker leaves and reduced canopy development and suggest a potential confounding factor in using canopy reflectance in FB-HTP. Leaf thickness also affected CID, more so in the Pima population where a greater proportion of significant correlations were observed than in the upland population, implying a direct effect on leaf-level transpiration efficiency. However, variation in thickness was not associated with fiber quality. Line-by-year and line-by-irrigation regime interactions emphasize the need to understand how leaf thickness might vary with in-season environmental conditions, especially in large-scale phenotyping efforts. Overall, our results support considering variation in leaf thickness as a potential contributing factor to variation in NDVI or other traits measured via proximal or remote sensing and as a trait that impacts other physiological responses.