Proximal Hyperspectral Imaging Detects Diurnal and Drought-Induced Changes in Maize Physiology

Hyperspectral imaging is a promising tool for non-destructive phenotyping of plant physiological traits, which has been transferred from remote to proximal sensing applications, and from manual laboratory setups to automated plant phenotyping platforms. Due to the higher resolution in proximal sensing, illumination variation and plant geometry result in increased non-biological variation in plant spectra that may mask subtle biological differences. Here, a better understanding of spectral measurements for proximal sensing and their application to study drought, developmental and diurnal responses was acquired in a drought case study of maize grown in a greenhouse phenotyping platform with a hyperspectral imaging setup. The use of brightness classification to reduce the illumination-induced non-biological variation is demonstrated, and allowed the detection of diurnal, developmental and early drought-induced changes in maize reflectance and physiology. Diurnal changes in transpiration rate and vapor pressure deficit were significantly correlated with red and red-edge reflectance. Drought-induced changes in effective quantum yield and water potential were accurately predicted using partial least squares regression and the newly developed Water Potential Index 2, respectively. The prediction accuracy of hyperspectral indices and partial least squares regression were similar, as long as a strong relationship between the physiological trait and reflectance was present. This demonstrates that current hyperspectral processing approaches can be used in automated plant phenotyping platforms to monitor physiological traits with a high temporal resolution.


INTRODUCTION
One of the major challenges in plant biology and crop research is elucidating the link between genome, physiological processes, plant trait performance, and yield across genotypes and environments. This requires combining vast amounts of genotypic data with corresponding phenotypic measurements (Yang et al., 2014;Clauw et al., 2016). To facilitate the collection of phenotypic measurements, high-throughput phenotyping platforms have been developed to automate data collection on a large number of plants (Granier et al., 2006;Busemeyer et al., 2013;Virlet et al., 2017). These platforms are often equipped with imaging systems that capture data non-destructively and monitor plant features over time (Humplík et al., 2015).
One imaging technology that has gained popularity in recent years is hyperspectral imaging, which can record high spectral resolution data of the visible and near-infrared (VNIR) and shortwave-infrared (SWIR) regions of the electromagnetic spectrum. Depending on the spectral range of these sensors, a wide variety of physiological traits can be studied, such as photosynthetic efficiency and pigment content (Inoue et al., 2008;Yendrek et al., 2017), leaf thickness (Neilson et al., 2015), water (Kim et al., 2015), nitrogen (Serrano et al., 2002;Yendrek et al., 2017), and lignin (Serrano et al., 2002) content. This wide range of physiological relationships has resulted in the use of hyperspectral sensors for research areas including disease detection (Huang et al., 2007), drought stress (Römer et al., 2012), pigment content (Yendrek et al., 2017), photosynthetic activity (Yendrek et al., 2017;Fu et al., 2019), and yield (Zhang and He, 2013). Hyperspectral studies have been performed on both landscape level (remote sensing), and plant or organ level (proximal sensing). Proximal sensing has been applied on both field and in-door phenotyping studies by mainly using nonimaging spectrographs (Odilbekov et al., 2018;Ge et al., 2019;Osco et al., 2020). More recent studies on the other hand have utilized hyperspectral imaging for close-range phenotyping on in-door automated platforms, allowing the monitoring of spatial and temporal variations in traits that were previously inaccessible (Ge et al., 2016;Moghimi et al., 2018;Thomas et al., 2018).
In-door automated phenotyping platforms provide a more controlled phenotyping environment and higher spatial resolution compared to remote sensing and closerange field setups. This difference in experimental setup may influence the effectiveness of common hyperspectral processing approaches, such as indices and machine learning algorithms. Indices are commonly used in remote sensing to reduce the voluminous multidimensional data, which can be challenging to analyze. Some hyperspectral indices have been used in in-door phenotyping studies to detect the effects of extreme temperature and salinity stress (Simko et al., 2016;Feng et al., 2018). Many hyperspectral indices have been developed though on data collected during severe biotic or abiotic stresses to increase the measurement range that can be linked to reflectance (Gamon et al., 1992;Peñuelas et al., 1993;Gitelson and Merzlyak, 1997;Gitelson et al., 2002). Consequently, these indices perform well in severe stress studies, while it is uncertain how sensitive they are to more subtle physiological differences, which can be more easily investigated under controlled conditions. More recent studies have demonstrated the use of machine learning algorithms to analyze hyperspectral data collected in in-door phenotyping platforms. These algorithms can learn the relationship between the plant spectrum and a trait in an automated manner. Several algorithms, such as support vector machines, simplex volume maximization (Thomas et al., 2018), artificial neural networks (Kong et al., 2014) and partial least square regression (PLSR; Pandey et al., 2017), have been used successfully to predict phenotypic traits or classify different degrees of stress.
The aforementioned methods are also affected by the higher resolution in proximal in-door phenotyping setups, as the effect of plant geometry on illumination and reflectance variation is more pronounced. This additional non-biological reflectance variation can mask biological effects and complicate hyperspectral data interpretation. Several methods, such as 3-dimensional (3D) modeling and standard normal variate normalization, have been proposed to reduce this variation (Vigneau et al., 2011;Behmann et al., 2016;Roscher et al., 2016;Asaari et al., 2018). These approaches require, however, additional 3D information or perform transformations, which may complicate data interpretation by limiting the use of indices (Asaari et al., 2018). An alternative and more intuitive method to reduce illumination effects is to subdivide plant pixels into sunlit and shaded classes. The effectiveness of this approach has been demonstrated in remote sensing (Clark et al., 2005), whereas its usefulness in in-door automated phenotyping platforms is not yet established. In this study, the performance of this alternative approach to reduce the more pronounced illumination effects in proximal sensing was examined by performing a light classification on the plant pixels.
Besides examining the non-biological reflectance variation associated with a proximal imaging setup, the use of spectral measurements for proximal sensing and its application with regard to diurnal, developmental and drought-induced responses were investigated in an automated high-throughput phenotyping system. The prediction accuracies of vegetation indices and PLSR models were compared in order to determine whether common hyperspectral data analysis approaches could provide accurate proxies for physiological plant traits. The case study chosen for this analysis was the effect of drought stress, i.e., soil water deficit and high vapor pressure demand, on the reflectance and physiology of maize. The experiments focused on moderate drought stress that allowed maize plants to continue their growth and development at a reduced rate without experiencing senescence.

Experimental and Hyperspectral Imaging Setup in the PHENOVISION Automated Plant Phenotyping System
PHENOVISION is a phenotyping platform located in the greenhouse infrastructure of the VIB-UGent Center for Plant Systems Biology (Ghent, Belgium), which is primarily used for maize phenotyping. The platform allows the monitoring of 392 plants throughout their vegetative development by nondestructive imaging. 1 It consists of a conveyor belt system that transports plants from the growth area to the automated weighing-irrigation stations and imaging cabins. These imaging cabins contain three camera systems: a red-green-blue camera system in a multi-view setup for growth-related phenotyping, a thermal infrared camera for estimating plant water use behavior, and a proximal top-view hyperspectral imaging system for physiological phenotyping. The hyperspectral imaging system consists of two pushbroom line scanner spectrographs (VNIR and SWIR) mounted on a motorized linear stage (1.5 m in length) in a dedicated cabin that is equipped with a white reference surface, halogen lighting frames that move alongside the cameras, and a lift with rotating platform, which positions the plant at the optimal distance from the top-view cameras and at the level of the white reference surface. The spectrographs possess a spectral range of 400-1,000 nm (ImSpector V10E, SPECIM, Finland) and 970-2,500 nm (ImSpector N25E, SPECIM, Finland). The spectral resolution of the VNIR V10E is 0.72 nm. A spectral binning of four was applied, reducing the resolution to 3 nm (194 bands). The V10E contains 1312 pixels per line, which were binned by four to match the SWIR spectrograph producing 328 pixels. The SWIR N25E has a spectral resolution of 6.3 nm (256 bands) and contains 320 pixels per line. In total, 511 lines were scanned to create an image of 511 × 328 or 320 pixels. The focal lengths of the V10E and N25E are 18.5 and 15 mm, respectively. Both spectrographs have a field of view of 0.75 m and a spatial resolution of 2.35 mm at 1.2 m, which is the optimal distance from the scanners regarding focal depth. The spectral data acquired by the line scanners consists of the aerial portion of a single plant, the white reference surface, and a black reference (camera shutter closed).
Hyperspectral imaging in PHENOVISION has been optimized for high-throughput phenotyping with data acquisition occurring at a rate of one minute per plant (time between the entrance and exit of each plant in the imaging cabin).
For this study, two drought stress experiments were performed: an exploratory experiment (EXP) that investigated the effect of drought on reflectance, and a validation experiment (VAL) in which the drought-induced changes in reflectance were linked to physiological traits. Maize plants were grown in a semi-controlled environment in which air temperature was set to 22-23 • C and vapor pressure deficit (VPD) to 1-1.2 kPa by means of air relative humidity modifications until the V5 stage of maize vegetative development, i.e., during seeding establishment. After the V5 stage, the environment system was adjusted to provide a diurnal gradient, with the temperature gradually changing from 22 • C at night to 28 • C in the afternoon. This diurnal temperature pattern resulted in diurnal variations in VPD of 0.95-2 kPa (night/afternoon), attempting to mimic diurnal variations in temperature and VPD under field conditions. A 16/8-h day/night light cycle was implemented in the greenhouse using high-pressure sodium vapor lamps, to achieve an average light intensity of 280 µmol m −2 s −1 . The greenhouse photosynthetically active radiation (PAR), relative humidity, VPD and temperature were continuously monitored by four environmental monitoring stations containing an SKH 2053 humidity and temperature sensor, and a PAR SKL 2625 sensor (Skye Instruments, United Kingdom). Maize (Zea mays L.) inbred B104 (Liu et al., 2003) plants were sown in 7-l pots filled with 850 g of peat-based soil with osmocote fertilizer (N.V. Van Israel, Belgium) and were randomly placed on the PHENOVISION platform. The plants were fertilized weekly with 40 ml of 200 ppm N Peters Excel CalMag Grower (Everris, Netherlands) solution once the V5 stage was reached.
During the two experiments, maize plants were monitored from emergence until the V7 stage (Ritchie et al., 1996). Plants were randomly subdivided in three groups: well-watered (WW), water deficit (WD) and border plants. EXP included 77 WW, 76 WD, and 80 border plants, while VAL had 82 WW, 77 WD, and 68 border plants. In both experiments, the plants were irrigated to a WW soil water content (WC) of 2.4 g H 2 O g −1 dry soil (soil water potential of -10 kPa) until plants reached the V5 stage, after which the drought treatment was initiated. Water was withheld from WD plants for 6-7 days (acute drought period) until a soil WC of 1.4 g H 2 O g −1 dry soil (soil water potential of -100 kPa) was attained, after which plants were irrigated to sustain the WD soil WC. The end of the acute drought period was reached at the V6-V7 stage (six to seven fully developed leaves). Throughout both experiments, hyperspectral images were collected, while physiological measurements using standard measurement equipment were only obtained during the drought period of VAL.
The non-destructive measurements consisted of photosynthetic rate (A), transpiration rate (E), stomatal conductance to H 2 O (g s ), and quantum yield based on CO 2 ( CO2 ), effective quantum yield ( PS2 ) and energy harvesting efficiency by oxidized PS2 (F v /F m ) and were collected with a portable LICOR 6400-XT Infrared Gas Analyzer (LI-COR Biosciences, United States). Within the leaf chamber, a steady-state CO2 level of 400 µmol −1 was maintained, while temperature and PAR were adjusted to the greenhouse temperature (25-31 • C) and PAR (230-360 µmol photons m −2 s −1 ) at every measurement time point. The fluorescence parameters were determined using the LI-6400 manual guidelines for light-adapted leaves.
Destructive sampling was performed on top leaves (leaves 5-9), which were visible in the hyperspectral image. The tip of the leaf was used for water potential ( ) measurements collected with a pressure chamber (PMS Instrument Company, United States), whereas 5 cm from the middle of the leaf was weighed and dried to estimate WC per g dry weight. Leaf disks were sampled to measure anthocyanin, carotenoid and chlorophyll content (Supplementary Materials and Methods 1).

Hyperspectral Data Collection and Pre-processing
Hyperspectral imaging was performed once per day between 8 AM and 3 PM in the EXP, whereas during the VAL, plants were imaged four to five times on sampling days and three times on regular days (8 AM-7 PM). The number of plants imaged per hour was ±16 for EXP and ±50 for VAL. All of the image processing was performed using the computing environment R version 3.2.2 (R Core Team, 2015) and was limited to the first 10 days after the initiation of the drought treatment (the acute drought period). The reflectance (ρ) of each image was calculated based on a white and a dark reference image which were acquired just before the scanning of the plant. The aim was to compensate for pixel-to-pixel sensor response differences and spatial non-uniformities in illumination.
Visible and near-infrared pixels of the plants were extracted from the images using the red-edge normalized vegetation index, which has a range from 0.2 to 0.9 for green vegetation. Pixels with an index value higher than 0.35 were classified as plant pixels. The segmentation of the plant in the SWIR images was performed using a random forest model that was trained on images from juvenile (V2) to mature (VT) maize plants. The random forest model was created with the "randomForest" package for R (Liaw and Wiener, 2002) and had a confusion matrix accuracy of 99.8% on an independent test dataset. The amount of segmented plant pixels that actually contained plant was 99.6%.

Reducing Non-biological Variation Caused by Inhomogeneous Illumination
Plant reflectance varied strongly within the plant image. This variation was typically non-biological as it resulted from inhomogeneous illumination of the plant surface. To reduce this variation, a light classification was performed that subdivided plant parts into classes with similar illumination. For the VNIR classification, brightness was used as a proxy for illumination and was calculated from blue (492 nm), green (539 nm), and red (651 nm) images using the "rgb2hsv" function of the "grDevices" R package. The brightness calculated with this function turned out to correspond with green reflection, which was the least affected by the drought treatments. Light classes were created by performing k-means clustering using the "stat" R package on the brightness values of a training dataset that included WW and WD plants from different developmental stages (V5-VT). This resulted in three classes: low, intermediate, and high light. Because the low and high-light class still showed a high variability, they were further subdivided with k-means clustering into extremely low, low, high, and extremely high light classes. This additional k-means clustering step removed most of the leaf edge and vein pixels from the low and high-light classes. The classified training dataset was used to determine fixed brightness thresholds, by looking at the distribution and overlap in brightness values of the light classes that were later applied on the whole dataset. The thresholds were used to create binary images for all the light classes of each plant. To assure that the classification of SWIR corresponded to that of VNIR, a coregistration of the VNIR binary images to the SWIR scanner was performed. The positions of VNIR pixels in the SWIR image were calculated using sample and line transformation formulas, which were developed by determining the linear relationship between the coordinates of corresponding chessboard points. After the classification, an average reflectance value was calculated for each plant-light class combination. Only the data of the intermediate light class were further analyzed in this case study. This light class was selected based on both the percentage of pixels per plant it contained and the ability to detect drought-related effects.
The intermediate light class also included the physiological measurement locations within the plant.

Indices
The development of indices and models, as well as a part of the statistics, were performed using the computing environment R version 3.4.3 (R Core Team, 2017). Publicly available indices were calculated from plant reflectance ( Table 1) and their performance to detect drought and to predict physiological traits was evaluated. Additionally, new indices were created from the validation experiment ( Table 2) by means of two methods. These methods used all available noise-free wavelengths (480-2470 nm) present in the hyperspectral dataset. The first method calculated all possible ratios and normalized difference ratios of wavelengths and selected the one with the highest correlation ("stats" and "rmcorr" R package) to the trait of interest (Bakdash and Marusich, 2018). The second method computed correlations for each wavelength and determined the wavelength with the highest correlation. This wavelength was subsequently combined with other wavelengths using basic mathematical operations, until the correlation between the index and trait stabilized or until the index contained four different wavelengths.

Model Development
Physiological trait predictions were accomplished with indexbased models and PLSR models. The aim of this modeling approach was to develop prediction models that can predict both developmental, diurnal and drought-induced changes in physiological traits during the acute drought period. This was achieved by training the models with physiological measurements and images collected at different time points during the day and on multiple days during the experiment. An 80% portion of the dataset (142 images of gas exchangeand fluorescence-measured plants and 88 images of -and WC-measured plants) was used to train the models, while 20% (36 and 23 images of gasexchange/fluorescence-and /WCmeasured plants, respectively) was set aside for validation. The physiological traits that were selected for modeling consisted of A, E, gs, CO2, PS2, Fv /Fm , , and WC. The index-based models had a physiological trait as dependent and an index as  Red Green Ratio Index (RGRI) RGRI = ρ red ρgreen ρ red = 600-700 ρ green = 500-600 Gamon and Surfus, 1999 Normalized Difference Vegetation Index (NDVI) independent variable. They were created with the "lm" function of the "stats" R package. The PLSR models were developed with the PLSR function ("pls" R package) using the classical orthogonal score algorithm and Leave One Out Cross-Validation (Mevik et al., 2016). This technique has been widely used to construct predictive models of data that have more variables than observations and are highly collinear, such as hyperspectral data (Geladi and Kowalski, 1986;Pandey et al., 2017;Ge et al., 2019). The PLSR algorithm will reduce the dimensionality of the data, by extracting latent variables or components that account for most of the co-variance between the dependent and predictor variables in the data, while accurately modeling the trait of interest. The number of latent variables the model used will strongly influence the accuracy and overfitting of the model. To optimize the tradeoff between accuracy and overfitting, the predicted sum of squares was used to select the number of components (Chen et al., 2004). This was accomplished by determining the number of components at which the Root-Mean-Square Error (RMSE) of the predicted sum of squares was minimal (Wold et al., 2001;Yendrek et al., 2017). The contribution of each wavelength to the PLSR model was determined using the variable importance in the projection (VIP), which reflects the importance of each wavelength in the model (Chong and Jun, 2005). The VIP was used to perform feature selection using the procedure of Korn et al. (2010). The accuracy of the index-based and PLSR models were evaluated by calculating the RMSE and R-squared (R 2 ) of the Leave One Out Cross-Validation and test data predictions with the postResample function of the "caret" R package (Kuhn et al., 2018).

Statistics
The strong correlation between adjacent wavelengths produced multicollinearity. Wavelengths were therefore subdivided into groups based on a correlation threshold of 0.80. This resulted in 11 groups for which representative wavelengths were selected, namely 523; 551; 658; 708; 721; 976; 1,482; 1,694; 1,937; 2,110; and 2,321 nm (Supplementary Table 1). These wavelengths were used to evaluate the effect of drought treatments on the whole plant and light class population averages for each time point by performing a mixed model analysis. The same modeling approach was used to test the effect of drought on the nondestructive physiological traits, while a linear model was applied for the destructive measurements. To find associations between the physiological traits and reflectance a mixed and linear model was used for the non-destructive and destructive measurements, respectively. Time of imaging effects on the EXP reflectance data was investigated with a linear model created for the sixth day after the drought initiation. To incorporate the time of day effect in the drought effect analysis of the intermediate light class the data was reanalyzed with a factorial ANOVA that tested treatment differences within each hour for each day. This ANOVA analysis was also used to evaluate the effect of drought on indices. A P-value correction was performed during every statistical analysis. More details on the models and the type of P-value corrections can be found in the Supplementary Materials and Methods 2.

Spectral Data Acquired in a Proximal Imaging Setup Are Strongly Influenced by Variations in Illumination
The reflectance data collected in our proximal imaging setup showed a high within-plant variability (Figure 1A), which was attributed to the variation in illuminance (total light flux per unit area) across the plant surface and the amount of reflected light detected by the camera sensors. Two factors determine the illuminance: the intensity of the light source and the plant surface geometry (Sandmeier et al., 1998;Behmann et al., 2015). Due to the high resolution of proximal hyperspectral imaging, the influence of geometry was very prominent as can be seen in the top-view brightness image of a maize plant (Figure 1B), where the distichous leaf organization causes brightness variations between and within leaves. These variations were attributed to differences in the distance between the leaf and the light source, shading, and the angle at which the light reached the leaf surface (angle of incidence). This had severe consequences for the interpretation of the hyperspectral data. The overall high within-plant variability in RGRI values (Figure 1C), which is a popular index to monitor the anthocyanin-chlorophyll ratio, corresponded with the observed high variation in brightness ( Figure 1B), suggesting that variation resulted from differences in light exposure, and showing that it can mask potential effects of drought on reflectance. Several methods have been proposed to remove the geometryinduced reflectance variation (Mishra et al., 2020a), such as hyperspectral 3D models (Liang et al., 2013;Behmann et al., 2016), standard normal variate (SNV) transformation (Vigneau et al., 2011;Asaari et al., 2018), variable sorting for normalization (Mishra et al., 2020b), light classification or calculating plant averages. 3D models and SNV transformation have been shown to remove linear-illumination effects, while light classification and plant averaging are also able to reduce variation caused by non-linear shading effects in thermal images (Leinonen and Jones, 2004;Jerbi et al., 2015).
In this study, two methods to reduce the within-plant variability were compared: the averaging of plant spectra, and classification based on brightness as a proxy for illumination. Brightness was subdivided into five light classes by performing an unsupervised k-means classification on a training dataset containing the different treatments and plant developmental stages. Thresholds for brightness were based on the classification results and applied to the whole dataset. Pixels that fell beyond the uppermost threshold were classified as specular reflection and removed from the analysis. To reduce the dimensionality of the data, an average plant spectrum was calculated for each of the five light classes: extremely low, low, intermediate, high and extremely high light. They showed up to 2.3 ± 0.4% difference in relative green reflectance between subsequent classes, which was related to their pixel composition. The extremely low and low light classes contained mainly shaded leaf pixels, whereas the higher light classes consisted of illuminated plant pixels ( Figure 1D). Edge pixels were observed in both extremely low and extremely high light classes, while vein pixels were mainly present in the extremely high light class. To investigate how the pixel composition affected drought detectability, 11 wavelengths were selected (523; 551; 658; 708; 721; 976; 1,482; 1,694; 1,937; 2,110; and 2,321 nm) and the effect of drought was evaluated for each light class-wavelength combination. This analysis showed that classes containing shaded leaf parts (extremely low and low) received and reflected less light, resulting in less significant differences in relative reflectance between treatments (1-4 wavelengths, P < 0.05) compared to the higher light classes (8-11 wavelengths, P < 0.01, Figure 1E). The quantity of shaded and illuminated plant parts changed during the V5-V7 drought period ( Figure 1F, day 0-5). The high and extremely high light pixel number decreased, as the plants developed more leaves, while the number of low and intermediate light pixels increased. The intermediate light class had the highest relative pixel number during the whole experiment and was consequently the most representative class ( Figure 1F). Also, shaded and vein pixels were absent and therefore the intermediate light class was selected for the comparison with the overall average plant spectrum as an alternative data processing method. Light classification and plant averaging were compared for their ability to detect drought effects on the sixth day of the soil dehydration period, when these were maximal. A more pronounced drought effect was obtained in the intermediate light class than in the plant average spectra, with 10 and 6 significantly affected wavelengths, respectively (P < 0.05) (Figure 1E). Due to the clear difference in methods, it was decided to use the intermediate light class for this case study.

Proximal Hyperspectral Imaging Systems Can Detect Subtle Diurnal Changes in Plant Physiology
The reduction of the illumination effects mainly improved the ability to detect drought in the visible, red-edge and SWIR regions of the electromagnetic spectrum, whereas the NIR region still showed a high variability (Figure 1E). The time of imaging of individual plants, which ranged from 8 AM to 3 PM on day 6 in Figure 1E, affected NIR reflection, because significant decreases in NIR (976 nm), red-edge (708 and 721 nm) and SWIR (1,482; 1,694; 1,937; 2,110; and 2,321 nm) relative reflectance (P < 0.001) was observed over this timespan (Tables 3, 4 and Figure 2). The time-of-day effect was also found in the visible region (523, 551, and 658 nm), where an increase in relative reflectance was observed as the day progressed (P < 0.001) (Table 3, Figure 2 and Supplementary Figure 1). These time-ofday effects were observed in both WW and WD treatments and Slopes are compared between treatments and between wavelengths or troughs. *P < 0.05. **P < 0.01. ***P < 0.001 Slopes are compared between treatments and between wavelengths or troughs. *P < 0.05. **P < 0.01. ***P < 0.001.
differed significantly between treatments for red-edge (721 nm), NIR (976 nm), and one SWIR wavelength (1,694 nm). The WD treatment showed a stronger negative slope than the WW treatment, which resulted in a more pronounced drought effect in the afternoon (P < 0.05) (Tables 3, 4 and Figure 2). Diurnal changes were not limited to specific wavelengths but could also be observed in the depth of the troughs, which correspond to regions where reflectance decreases abruptly. Of the five troughs in the VNIR and SWIR regions with their dips at 979; 1,232; 1,445; 1,825; and 1,955 nm ( Figure 1E), three (979; 1,445; and 1,955 nm) showed a significant negative diurnal trend for both WW and WD plants (Figure 2 and Supplementary Figure 1). A treatment difference was only observed in the 1,955 nm trough (P < 0.01) ( Table 5 and Figure 2). The observed diurnal variation had major implications for spectrum interpretation and coincided with physiological trait changes and variations in environmental conditions (Figure 3). PS2 showed a minimum around noon corresponding to the PAR maximum in the greenhouse (Figures 3C,G). Leaf decreased during the day with the lowest values in the afternoon/evening, when E and VPD had passed their peak (Figures 3B,D,F). Significant strong correlations with VPD were observed in plant reflectance at 721 nm (correlation coefficient r, r WW = -0.80, r WD = -0.79, P < 0.05), 658 nm (r WW = 0.74, r WD = 0.77, P < 0.05), 523 nm (r WW = 0.74, r WD = 0.67, P < 0.05), and the 979 nm trough (r WW = -0.80, r WD = -0.78, P < 0.05). These wavelengths were in turn significantly correlated with E and . The reflectance at 658 nm showed a similar diurnal pattern as E with a maximum in the afternoon (Figure 3A), while 721 nm was negatively correlated with E. The correlation between E and reflectance at 658 nm was significant in both WW (r WW _658 = 0.65, P < 0.05) and WD (r WD _658 = 0.48, P < 0.05) treatments on all sampling days of the experiment, except for the WD treatment on day 7. This was the last day of the acute drought period, when E was at its minimum and 658 nm reflectance at its maximum. On this day, a stronger diurnal effect on reflectance than on E was observed, suggesting an additional effect of drought on reflectance at 658 nm not directly related to E. A positive relationship was found between and the 979 nm trough, but was only significant for the WW treatment (r WW = 0.56, P < 0.05). The correlation with 523 and 658 nm was significantly negative for both treatments (523 nm: r WW = -0.51, r WD = -0.46, P < 0.05; 658 nm: r WW = -0.67, r WD = -0.41, P < 0.05).

Drought-Induced Reflectance Changes Are Correlated With Physiology
Withholding water from maize plants affected leaf reflectance in both the VNIR and SWIR areas of the electromagnetic spectrum. An increase in reflectance was observed in the bluegreen, red, and SWIR region (523; 658; 1,482; and 2,110 nm), while a decrease was present in the green, red-edge and NIR region (551, 708, 721, and 976 nm) (P < 0.01, Supplementary  Figures 3,4). Red reflection was the most sensitive to drought, as significant treatment differences were observed already two days after the onset of drought (P < 0.01, Supplementary  Figure 3). Reflection in the blue-green, red-edge and SWIR (1,482 nm) region showed drought effects on the fourth day (P < 0.01, Supplementary Figures 3,4). NIR reflectance was the least sensitive to acute drought as it only showed significant differences between treatments near the end of the drought period (day 6, P < 0.01, Supplementary Figure 3). The ability to detect these drought effects was influenced by the interaction between time-of-day and treatment, as most treatment effects were first detected in the early afternoon. This interaction was especially pronounced for the red-edge (721 nm) and NIR (976 nm) wavelengths, which have been positively correlated with leaf thickness and negatively correlated with the WC (Slaton et al., 2001;Neilson et al., 2015). During the morning, E increased ( Figure 3B), resulting in a decrease in WC, (Figure 3D and Supplementary Figure 2H) and leaf thickness (Syvertsen and Levy, 1982;Mcburney, 1992;Jinwen et al., 2009). This decrease in leaf thickness was probably more pronounced in WD than WW plants, translating into a stronger drop in WD plants' NIR and red-edge reflection, and therefore a larger treatment difference in the afternoon.
Drought had a negative effect on the measured physiological traits, which was observed earlier for E, g s , F v /F m , WC, and and later for A, PS2 , and CO2 (P < 0.05, Figure 3 and Supplementary Figure 2). All wavelength groups, except for the 2,321 nm group, correlated significantly with one or more of these traits (Figure 4 and Supplementary Table 2). The most promising reflectance-physiology relationships were observed between red reflectance (658 nm) and , PS2 or F v /F m with negative slopes (P < 0.0001, Figures 4A-C). The relationships with red reflection were significantly different between WW and WD treatments (two-tailed Student's t-test, P < 0.0001), except for the relation with . This suggests that the underlying relationship is the same for diurnal, developmental and drought-induced changes. Other treatment-independent, The trough depths are estimated by taking the difference in relative reflectance between the highest (852; 1,100; 1,232; 1,650; and 1,825 nm) and the lowest (979; 1,232; 1,445; 1,825; and 1,955 nm) wavelengths of the troughs. Slopes are compared between treatments and between wavelengths or troughs. *P < 0.05. **P < 0.01. ***P < 0.001. positive relationships were observed between 976 nm and CO2 or A (P < 0.001, Figures 4E,H). Significant treatment effects were present in the relationships of 523 nm with g s , 976 nm with WC and were very pronounced for 976 nm and E, as correlations were only present in the WW treatment (Figures 4D-G, P < 0.001). The lack of correlation in the WD treatment may be related to the limited range of E values resulting from increased stomatal closure. The drought effects on physiology were not limited to direct effects but were also influenced by differences in development, which are discussed in the Supplementary Results 1.

Both Index-Based and PLSR Models Can Accurately Predict Physiological Traits From Hyperspectral Data
The significant relationships between reflectance and physiology can be used for automated plant phenotyping by calculating vegetation indices for physiological traits or by creating PLSRbased physiological trait prediction models. Here, 10 publicly available and 10 new indices (Tables 1, 2) were evaluated for their ability to detect drought and their prediction accuracy of physiological traits by creating index-based linear models. The accuracy of these models was subsequently compared to PLSR models of the same traits, to evaluate which of the two methods was more suitable to monitor drought effects. The physiological traits that were selected for this comparison were A, E, g s , PS2 , F v /F m , , and WC, as these were significantly affected by the drought treatment. The published indices under evaluation included WC indices (Water Band Index, Moisture Stress Index and Relative Water Content index), photosynthetic efficiency indices (PRI and RGRI), pigment content indices (Modified Chlorophyll Absorption Ratio Index and Carotenoid Reflectance Index 1) and indices that used red or NIR reflection (NDVI, Ratio Vegetation Index 870/610, Ratio NIR/510) (for references see Table 1). Large differences in drought sensitivity were observed in both the existing and new indices, ranging from early drought responses to no consistent effects. The indices were grouped accordingly: very sensitive, sensitive, moderately sensitive and insensitive. The results are presented in Supplementary Results 2.
Partial least square regression has been commonly used to estimate physiological traits from hyperspectral data. It has the advantage of using all the available wavelengths, but some may turn out to add additional noise. Here, the use of PLSR slightly improved the test RMSE of E, PS2 , WC, and F v /F m by 7, 13, 8, and 12%, respectively ( Figure 5). However, , g s , CO2 , and A showed a lower PLSR accuracy than the indexbased models with a reduction in the RMSE of 10, 5, 17, and 24%, respectively (Figure 5). Maize reflectance showed a weaker relationship with A compared to the other physiological traits, resulting in a low accuracy of both index-based and PLSR models. The importance of each wavelength in the PLSR models was determined by calculating the VIP (Chong and Jun, 2005). The 30 wavelengths with the highest VIP scores were compared to those used by the indices (Supplementary  Table 3), showing that most PLSR models contained wavelength regions similar to those important in the index-based linear models. The F v /F m , PS2 , and models consistently used red and/or red-edge reflectance, while for the predictions of CO2 and g s , NIR reflectance was always important. The water absorption regions were also conserved between the indexbased and PLSR models of E, A, and WC. The wavelengths that were important in PLSR can be found in Supplementary  Table 3. One wavelength region around 2,500 nm was never used in the index-based models, but had the highest VIP value and was very important for the PLSR models, although it was located at the edge of the SWIR spectrum and may have been noisy. The comparison of PLSR and index-based models showed that when a strong relationship between reflectance and the physiological trait is present, both PLSR and indexbased models can accurately estimate physiological traits from hyperspectral data.

Within-Plant Illumination Differences Cause Large Non-biological Variation in Reflection
In this study of proximal hyperspectral imaging in the context of an automated phenotyping platform, large non-biological variation in reflectance caused by illumination differences due to plant geometry masked subtle biological differences in reflectance, including diurnal patterns and drought responses. To manage the illumination effects, the use of artificial diffuse light has been proposed to reduce the effects of geometry on hyperspectral data, but it does not remove differences caused by the distance between the leaves and the light source (Thomas et al., 2018). These effects can be reduced by combining hyperspectral data with high-resolution 3D plant models , created by means of stereo or rotating red-green-blue and hyperspectral camera systems that image the plants from different angles (Liang et al., 2013). Once 3D information of each hyperspectral pixel is available, the angle and the distance between the pixel and the light source can be calculated, and their effect can be removed by a correction factor for each pixel determined from the light field of illumination intensities . Distance and inclination effects can also be removed by using the 3D information to estimate the parameters of a linear reflectance model (Vigneau et al., 2011;Asaari et al., 2018), by incorporating distance and angle information in trait prediction models (Roscher et al., 2016) or by developing a bidirectional reflectance distribution function . Combining 3D information with hyperspectral data is a promising approach to reduce the illumination effects, however, this information is not always available, and the sheer volume of this extra data further challenges storage and processing capacities. Consequently, several alternative methods have been proposed, the simplest of which is to reduce the within-plant variation by calculating plant averages. This method removes all spatial information and reduces spectral variation without taking the source of the variation into account. By ignoring the illumination effects (variation source) a positive relationship between reflectance and the distance between the plant and the light source is created (Asaari et al., 2018), which complicates hyperspectral interpretation when plants differ in height. Vigneau et al. (2011) and Asaari et al. (2018) have proposed to model a correction using a linear reflection model, which assumes a multiplicative effect of distance and inclination and an additive effect of specular reflection. Alternative approaches to remove linear effects are spectra normalization methods such asthe SNV transformation and variable sorting normalization (Vigneau et al., 2011;Asaari et al., 2018;Mishra et al., 2020b). The SNV transformation can cause a wavelength shift in the treatment differences (Fearn, 2009), which prevents the use of existing indices and complicates the biological interpretation of the spectra. More details on the available linear illumination correction approaches have been described in the review paper of Mishra et al. (2020a).
The methods discussed above have focused on linear effects such as distance and inclination, whereas non-linear effects such as shading also strongly influence plant reflectance. Illumination classification, which subdivides plant parts based on the amount of illumination, can reduce the reflectance variation caused by linear and non-linear effects as both these factors affect brightness. This method has been used to separate sunlit and shaded leaf pixels in both hyperspectral remote sensing and thermal images (Jerbi et al., 2015). Plant pixels can be subdivided by splitting the leaves into segments from the base to the tip (Tirado et al., 2020) or by performing a brightnessbased classification. In this study, the latter was evaluated by subdividing plant pixels into five illumination classes. A kmeans clustering approach was used as a guide to determine the brightness thresholds, which subdivided plants in the minimum number of clusters required to reduce the illumination effects. The clustering approach was applied in a less conventional way without using the optimal number of clusters, as this amount varied strongly between optimization methods resulting in too little (1 cluster) or too many clusters (Charrad et al., 2014). The brightness-based classification reduced the withinplant variance in reflectance by 47 ± 24% and improved the detectability of drought effects especially in the classes that contained sunlit plant pixels (intermediate and high light). However, by using only a subset of the plant pixels, the method is less suitable for studies that focus on the spatial distribution of traits within the plant. In these situations the use of the above described linear correction methods is more appropriate. The main advantages of the illumination classification compared to the use of 3D information and normalization is that it reduces both linear and non-linear illumination effects, it does not require additional information on plant geometry and it allows the use of published indices, as the data is not transformed. The combination of these advantages makes this method more intuitive and accessible for plant scientists. The brightness-based clustering, including the appropriate number of clusters, the brightness thresholds and the optimal cluster for analysis, is setup-specific and requires a preliminary analysis as described in this study.

Hyperspectral Data Allow the Detection of Diurnal and Drought-Induced Changes in Physiology
By reducing the illumination-induced variation, subtle diurnal trends in plant reflectance were detected, which were correlated with changes in plant physiology. Diurnal effects are unavoidable in the PHENOVISION platform, because a 6-h imaging cycle is required for almost 400 individual plants. On the other hand, the results show that hyperspectral imaging delivers high-resolution plant physiology-related data and that the current setup allows the monitoring of diurnal changes in plant physiology, which is important in the timely detection of and genotypic specific sensitivity to stress, and because treatment effects can depend on the time of sampling or imaging (Figure 2; Žibrat et al., 2019). Consideration of the time-of-day has mainly been limited to the PRI index created to detect diurnal changes in the epoxidation state of the xanthophyll cycle pigments and in the photosynthetic efficiency (Gamon et al., 1992). Other studies have shown a significant correlation between PRI and F v /F m on both diurnal and seasonal scales (Garbulsky et al., 2011;Zhang et al., 2016). Here, diurnal trends in reflectance were observed in a wide spectral range, including blue-green (521 nm), red (658 nm), red-edge (708 nm), NIR (721 nm), and SWIR (1,482; 1,694; 1,937; 2,110; and 2,321 nm) wavelengths. Diurnal trends in red and NIR reflection have been observed before, but these effects were attributed to the influence of soil reflection, differences in solar angle and plant geometry (Asrar et al., 1985;Gamon et al., 1992). In PHENOVISION, these factors do not play a major role. Instead, the diurnal trends showed correlations with VPD, E, and . The strongest relationships were observed with reflection of red, NIR and the water absorption trough at 979 nm. NIR and water absorption trough reflectance have been related to WC, leaf thickness and anatomy (Slaton et al., 2001;Neilson et al., 2015), which can change during the day as the plant loses water through E (Syvertsen and Levy, 1982;Mcburney, 1992;Jinwen et al., 2009). These changes may thus explain the negative correlation of NIR and the 979-nm trough with E and VPD. A biological explanation for the strong relationship between red reflectance and E, , or VPD is less straightforward, as no clear diurnal trends in pigment content were observed. Red absorption and reflectance can be affected by the amount of light scattering within the tissue by changing cell size and shape or by changing the positioning of organelles, such as chloroplasts. Changes in leaf thickness and water loss can affect mesophyll density and cell size and shape (Chartzoulakis et al., 2002;Scippa et al., 2004;Sancho-Knapik et al., 2011;Wuyts et al., 2012), however, the few studies that investigated these effects focused on different degrees of drought to reduce leaf WC instead of looking at diurnal variations. Diurnal changes in chloroplast positioning have been observed in C4 plants as a light avoidance response with aggregative movements in the afternoon to increase mutual shading of chloroplasts (Yamada et al., 2009). Maai et al. (2011) showed that these light avoidance and aggregative movements are triggered by high levels of blue light and abscisic acid (ABA). A strong correlation was observed between red reflection and VPD, which has been associated with ABA levels, an important phytohormone involved in the regulation of g s (Merilo et al., 2018). g s was monitored during the validation experiment but only a drought-induced decrease could be observed, suggesting that chloroplast movement mainly played an important role in the drought-induced increases in red reflection.
Drought effects on plant physiology and reflectance have been studied in both lab and field experiments. The relationship between plant reflectance and physiology is complex, because drought affects almost the whole measured plant spectrum, and physiological traits can influence multiple wavelengths directly or indirectly (Römer et al., 2012), which was confirmed here. The earliest effects were observed in the visible region, where an increase in red reflectance was present two days after the onset of the drought treatment. Increases in red reflectance attributed to changes in pigment content (Sims and Gamon, 2002;Sun et al., 2018) were not visible here, instead red reflectance showed a strong correlation with photosynthetic efficiency ( PS2 or F v /F m ) and , suggesting that other factors, such as light scattering within the leaf and chloroplast positioning, may have impacted red reflectance during the drought treatment. The negative relationship between red reflectance and did not differ between treatments, implying that the underlying link is the same for developmental and drought-induced changes. Changes in light scattering will also affect NIR reflectance. The wavelength that represented NIR reflectance in this study (979 nm) is located in the NIR water absorption trough, which has been correlated with WC (Peñuelas et al., 1993;Corti et al., 2017). A correlation between WC and 976 nm was only present in the WW treatment, indicating that this wavelength could only detect diurnal and developmental changes in WC when there were no drought effects present (Figure 4), and that it is affected by multiple drought-induced physiological changes, of which WC is only a small part. Two additional water absorption troughs are present in the SWIR region (1,482 and 1,937 nm), which showed weak correlations with WC in both treatments. Correlations between these wavelengths and WC measurements have been observed in previous drought studies noting increases in SWIR reflectance as the drought effect progressed (Susič et al., 2018;Krishna et al., 2019). The strength of these relationships depended on the type of WC measurement, as stronger correlations were observed for the leaf WC per unit area than WC per leaf dry weight (Yu et al., 2000;Ceccato et al., 2001;Yi et al., 2013). A correction of the dry mass effect on SWIR reflectance is necessary to properly estimate the WC. This may be accomplished by combining SWIR with the NIR region that is only influenced by dry mass (740-900 nm) in indices or models.

Predicting Physiological Traits Using Hyperspectral Index-Based Linear and PLSR Models
Hyperspectral imaging produces voluminous multidimensional data. One way to extract biological information is to reduce the dimensionality by calculating indices that only use a subset of the reflectance spectrum. A vast number of indices created for remote sensing applications are now used in proximal phenotyping studies. NDVI is a very popular index applied in several drought studies (Kim et al., 2011;Römer et al., 2012;Behmann et al., 2014;Sun et al., 2018). Its performance ranges from no drought-induced differences to a significant reduction in NDVI values. Behmann et al. (2014) observed that NDVI has a higher relevance in the senescence stages of barley, suggesting that this index is less suitable for early drought detection when drought does not affect pigment content and leaf senescence. This was confirmed here, because the NDVI only showed significant effects from day 4 of the drought treatment onwards, while the RGRI detected effects by the second day. In maize, drought treatments cause an increase in RGRI values (Sun et al., 2018), which can be attributed to changes in photosynthetic efficiency and/or pigment content. Because a mild drought was applied during these experiments, no clear differences in pigment content were observed, while photosynthetic efficiency (F v /F m and ps2 ) was significantly reduced and strongly correlated with the RGRI (Figure 5). The RGRI was also the only published index that correlated with one of the traits it was originally developed for. In addition, two WC indices (Relative Water Content index and R1451/1263) showed a relationship with and correlated with the plant water status. All existing WC indices were outperformed by the new water potential (WPI2) and water content (WCI) indices (R 2 of 0.92 and 0.73, respectively). These indices combined water absorption wavelengths with red reflection, which was the most drought-sensitive wavelength. The poor performance of published WC indices has also been observed by Yi et al. (2013) and may be attributed to the influence of multiple structural and physiological traits that affect their transferability to other species, genotypes, developmental stages and experimental setups. In addition, many indices were created for remote sensing applications, where reflectance is affected by non-biological factors that do not influence reflectance in proximal sensing.
Calculating indices is the simplest method to analyze hyperspectral data, but it utilizes only a small subset of the available wavelengths with a potential loss of biological information. Multivariate techniques, such as PLSR, provide an alternative where the whole-plant spectrum is used by reducing the dimensionality using principal components. PLSR models have been created in several drought studies to predict water use traits, such as , WC, and g s (Rapaport et al., 2015;Ge et al., 2016;Corti et al., 2017;Pandey et al., 2017;El-Hendawy et al., 2019). Three studies have compared the accuracy of index-based and PLSR models, demonstrating that these multivariate models performed equally or better than published indices (Hansen and Schjoerring, 2003;Atzberger et al., 2010;Heckmann et al., 2017;Yendrek et al., 2017;Ge et al., 2019). In this study, both slight accuracy improvements (7-13 %) and deteriorations (5-24 %) were observed when index-based models were compared to PLSR models. The similarity in the prediction accuracy of both methods can be explained by the fact that they use similar wavelength regions. The index-based or PLSR models with the highest prediction accuracy for the physiological traits , F v /F m , PS2 (R 2 WPI2_index = 0.92, R 2 PLSR = 0.88, R 2 PLSR = 0.86, respectively), incorporated red-edge and red reflection. These two wavelength regions were complemented by reflectance of the SWIR water absorption trough (1,457 nm) in the indexbased model. The importance of the red-edge and the SWIR water absorption trough for predictions has been confirmed by the grapevine case study of Rapaport et al. (2015), which suggests that these wavelengths may be related to across different plant species. The importance of rededge wavelengths in the prediction of drought-induced changes in physiology was not limited to , F v /F m , and PS2 , as this region received high VIP values in many physiological PLSR models, such as for g s , CO2 , and WC. High rededge VIP values have also been observed in the WC PLSR models of Corti et al. (2017). Only the E PLSR model did not incorporate the red-edge in its top 30 wavelengths. To predict E, NIR, and SWIR reflectance plays a more prominent role. These wavelength regions have been related to WC and leaf internal structure, illustrating their importance in monitoring plant water use behavior. The usability of PLSR models for reviewing physiological traits was demonstrated in several studies, but the transferability of PLSR and index-based models to other studies or datasets is uncertain. Vigneau et al. (2011) observed that PLSR models were dataset-dependent and could not be applied on plants grown in a different environment. Fu et al. (2019) went further in showing that stacking regression models improved predictions of hyperspectral reflectance with photosynthetic capacities in field-grown tobacco over any individual model, including those based on PLSR. Obviously more robust models can be created in other ways such as using multiple genotypes or species, growing conditions, sensors and developmental stages.

CONCLUSION AND FUTURE PERSPECTIVES
Hyperspectral imaging is now a demonstrated tool for plant phenotyping in automated platforms. In this case study, hyperspectral imaging was able to resolve diurnal changes in physiological traits, such as E, and detected the interaction between diurnal and drought-induced changes in plant physiology, which may mask drought effects. The imaging system was also able to satisfactorily monitor drought-induced changes in and photosynthetic efficiency with both indexbased and PLSR models. The PLSR models performed similar or better than the index-based models for most of the physiological traits, as long as a decent correlation between physiology and reflectance was present. However, indices do not always require the development of models as they can also be used to investigate relative differences between treatments. The results observed in this study were only obtainable after correcting the additional illumination-induced reflectance variation, which was accomplished by performing a light classification. This pre-processing approach requires no additional information making it a more easily accessible method to remove illumination effects. Nevertheless, illumination is not the only factor; plant development, growing conditions, genotype and species will also influence reflectance by creating biological variation that may not have been accounted for during the establishment of the hyperspectral processing protocols. Development affected both reflectance and physiology in this case study, resulting in greater treatment effects across many wavelengths. These development effects will become more pronounced when plants are monitored during their whole development. To our current knowledge, no information of the effects of plant development on reflectance and its relation to physiology has been published. A better understanding of the influence of these different factors on reflectance and its relationship with physiology is crucial for the development of robust indices and multivariate models that can be used in plant phenotyping and screening of drought-tolerant varieties. The plant reflectance spectrum contains a vast amount of information about physiological and structural plant traits, such as leaf internal structure, leaf thickness, pigment and WC. Current knowledge about the causal relationships between reflectance and plant traits is limited and insufficient to explain the more complex interactions between reflectance and physiology that were observed in this case study. Several hypotheses were proposed to explain these relationships, but more research is needed to validate or refine these hypotheses.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material, further inquiries can be directed to the corresponding author/s.

AUTHOR CONTRIBUTIONS
NW, DI, HN, SMa, GB, SC-B, JV, and WB conceived the original screening and research plans. NW supervised the experiments. LV, NW, and SMe performed the experiments, collected hyperspectral data and physiological measurements. KD, KM, BC, and JD provided technical assistance during experiments. HS and SMe analyzed the hyperspectral data. SMe performed statistics and wrote the manuscript. NW, DI, and HN revised the manuscript. DI agreed to serve as the author responsible for contact and ensures communication. All authors contributed to the article and approved the submitted version.

FUNDING
This work was supported by the Hercules Foundation (ZW1101) and the "Bijzonder Onderzoeksfonds Methusalem Project" (no. BOFMET2015000201) of Ghent University.

ACKNOWLEDGMENTS
We thank Dr. Veronique Storme for performing the mixed model approaches in SAS. We thank Dr. Annick Bleys for reading and preparing this manuscript for publication. We also wish to thank Drs. Liqun Xing, Erin Slabaugh, and Igor Kovalenko for critical review.