Rice Yield Estimation Using Parcel-Level Relative Spectral Variables From UAV-Based Hyperspectral Imagery

Time-series Vegetation Indices (VIs) are usually used for estimating grain yield. However, multi-temporal VIs may be affected by different background, illumination, and atmospheric conditions, so the absolute differences among time-series VIs may include the effects induced from external conditions in addition to vegetation changes, which will pose a negative effect on the accuracy of crop yield estimation. Therefore, in this study, the parcel-based relative vegetation index (ΔVI) and the parcel-based relative yield are proposed and further used to estimate rice yield. Hyperspectral images at key growth stages, including tillering stage, jointing stage, booting stage, heading stage, filling stage, and ripening stage, as well as rice yield, were obtained with Rikola hyperspectral imager mounted on Unmanned Aerial Vehicle (UAV) in 2017 growing season. Three types of parcel-level relative vegetation indices, including Relative Normalized Difference Vegetation Index (RNDVI), Relative Ratio Vegetation Index (RRVI), and Relative Difference Vegetation Index (RDVI) are created by using all possible two-band combinations of discrete channels from 500 to 900 nm. The optimal VI type and its band combinations at different growth stages are identified for rice yield estimation. Furthermore, the optimal combinations of different growth stages for yield estimation are determined by F-test and validated using leave-one-out cross validation (LOOCV) method. The comparison results show that, for the single-growth-stage model, RNDVI[880,712] at booting stage has the best correlation with rice yield with a R2-value of 0.75. For the multiple-growth-stage model, RNDVI[808,744] at jointing stage, RNDVI[880,712] at booting stage and RNDVI[808,744] at filling stage gain a higher R2-value of 0.83 with the mean absolute percentage error of estimated rice yield of 3%. The study demonstrates that the proposed method with parcel-level relative vegetation indices and relative yield can achieve higher yield estimation accuracy because it can make full use of the advantage that remote sensing can monitor relative changes accurately. The new method will further enrich the technology system for crop yield estimation based on remotely sensed data.


INTRODUCTION
Remote sensing technology is an important measure for collecting data on the Earth and its changes, and it has been widely used in all kinds of subjects such as water resources (Schneider et al., 2018), geology (Govil et al., 2017), ecology (Echappé et al., 2018), and agriculture (Guo et al., 2018). Remote sensing for agriculture has the advantages of nondestructive, non-invasive, fast, and cost-efficient monitoring, well-correlated with agronomical and important physiological crop traits (Reynolds et al., 2015). It has usually been applied to monitor crop growth status (Poenaru et al., 2017), to map vegetation area (Xiao et al., 2005), to estimate crop yield (Peng et al., 2014), etc. In recent years, those issues such as the environmental degradation and water pollution have caused the reduction of arable land and grain production, grain security has become a tremendous challenge in many countries and regions. So accurate grain yield estimation is a significant means to ensure grain security and agriculture production.
Although satellite images have been widely used in crop yield estimation, there are still many problems that we need to address. For example, high spatial and temporal resolution satellite images usually cannot be obtained at same time, sometimes it is impossible to obtain valid data at the key growth stage due to poor weather conditions. Unmanned aerial vehicles (UAV) equipped with sensors can remedy those defects mentioned above (Zhou et al., 2017). UAV remote sensing is a low altitude remote sensing system, which can acquire high spatial-temporal resolution remotely sensed data on demand. It has been used for agriculture monitoring in sugarcane (Luna and Lobo, 2016), sunflower (Vega et al., 2015), soybean (Yu et al., 2016), and triticale (Noack, 2016), yield prediction in rice (Zhou et al., 2017), wheat (Du and Noguchi, 2017) and barley (Honkavaara et al., 2013). At present, remote sensing sensors mounted on UAV include RGB digital camera (Luna and Lobo, 2016;Du and Noguchi, 2017;Zhou et al., 2017), NIR camera (Nebiker et al., 2016) and multispectral camera (Vega et al., 2015;Yu et al., 2016;Noack, 2016;Zhou et al., 2017). In recent years, with the development of imaging hyperspectral technology, imaging hyperspectral cameras have gradually been equipped on UAV to acquire remotely sensed data combining image with spectra (Honkavaara et al., 2012(Honkavaara et al., , 2013Yue et al., 2017). Imaging hyperspectral technology can obtain more spectral bands and precise spectral information, which is expected to further improve the monitoring accuracy. Hyperspectral images have been used for measuring individual parcel plots using ultra-high spatial resolution up to 1 cm per pixel (Turner et al., 2012), mapping high-precision leaf carotenoid concentration of vine in region scale (Zarcotejada et al., 2013), monitoring soybean LAI precisely by combining hyperspectral image with artificial neural network (ANN) (Yuan et al., 2017), early detection of olive verticillium using airborne hyperspectral and thermal imagery (Calderón et al., 2013).
Unmanned Aerial Vehicle platform equipped with remote sensing sensors can supply high spatial and temporal resolution images for precision agriculture, but there are still some issues that need be further investigated. For example, accurate radiometric correction for time-series images is a challenge for UAV remote sensing. The ideal condition for multi-period UAV data acquisition is that the weather is clear and the angle of the sun is the same for each flight. Actually, for each flight, the absorption and scattering of solar radiation by atmospheric molecules, dust and water vapor are different and the altitude angle of the sun is not the same due to the different flying time (Wang et al., 2003), which make the measured values of the sensor inconsistent with physical quantities, such as spectral reflectance or spectral radiance of the target (Wang et al., 2003;Roy et al., 2010). For example, the different compositions of water vapor, ozone, and aerosol in the atmosphere will affect the reflection of the Red band and the NIR near-infrared band (Teillet et al., 2000), which make radiometric corrections for time-series images more difficult. In addition, different soil backgrounds may also affect target reflections. Therefore, eliminating the interference of atmospheric and illumination conditions on remote sensing images and obtaining accurate canopy reflection data is the basis for predicting crop yield accurately. Radiometric calibration is an essential process for UAV remote sensing, in which the reflectance spectra are calculated by using reference white or gray board (Hall et al., 1991;Ding and Elvidge, 1996;Carvalho et al., 2013) and then vegetation indices at different growth stages are derived for yield estimation. However, due to the influence of atmospheric and illumination conditions in different dates, it is difficult to obtain absolutely accurate reflectance using Pseudo-Invariant Features (PIF) method (Carvalho et al., 2013) or standard reference calibration board method (Du et al., 2001), although these methods can mitigate these effects in some way. In fact, it is impossible to obtain absolutely accurate reflectance spectra, but it is easy to characterize the relative change of radiation variable for remote sensing technology. This is because the two variables used for calculating the relative changes are obtained under the same external conditions, such as atmospheric and illumination conditions, even background conditions, so the relative variable can eliminate the effects of these external conditions on target reflectance. Therefore, this study attempts to construct a parcel-level relative spectral value to weaken the negative effects from different external conditions, and estimate rice yield based parcel-level relative spectral values from multiple-growth-stage hyperspectral images. The purposes of this study are (1) to propose a crop yield estimation method based parcel-level relative spectral value, which can eliminate the effects of atmospheric and illumination conditions and rice field background on the target reflectance, (2) to make full use of the strength of hyperspectral image in band numbers to determine the optimal band combinations for rice yield estimation, (3) to determine the best combination of growth stages for rice yield estimation.

UAV-Based Hyperspectral Images Acquisition and Data Processing
Rikola Hyperspectral Imager (Rikola Ltd., Oulu, Finland) mounted on the DJI Matrice 600 Pro UAV (Figure 3) are used to obtain hyperspectral images at different growth stages. The UAV maximum flight altitude is 2.5 km and payload weight is 6 kg. The Rikola hyperspectral imager can capture two-dimensional hyperspectral images ranging from 500 to 900 nm with 62 bands in total, and the full width at the half maximum of the spectral band is 8 nm. Hyperspectral sensor exposure times were set ranging from 10 to 15 ms depending on sunlight conditions. Hyperspectral images were obtained continuously and saved to memory cards when UAV was flying. For each band, a 1010 by 1010 pixels image with 12 bit (4096 DN) is created. All pixels are true image pixels, no interpolation is used. Flight altitude was set to 200 m and flight route was fixed for each UAV flight campaign with flying time of 10 min. UAV-based hyperspectral images were obtained at rice tillering stage, jointing stage, booting stage, heading stage, filling stage, and ripening stage from 28 July to 24 October (Table 1).
Every hyperspectral images were processed with vignetting correction, lens distortion correction, black level calibration and band to band registration. DN values of raw images were transformed to radiance values. Except for band to band registration, all data processing was carried out with Rikola Hyperspectral Imager software (Rikola Ltd., Oulu, Finland).

Parcel-Level Relative VI and Relative Yield
As mentioned above, the use of relative vegetation index is expected to diminish the limitation of absolute differences that contain the uncertain information caused by different background, illumination and atmospheric conditions at different growth stages. The idea of parcel-level relative vegetation index is based on the hypothesis that solar radiation, solar altitude and atmospheric conditions are the same, and background are similar at one time of data acquisition. So, the one parcel of rice field is selected as reference parcel and others are seen as studied parcels. The parcel-level relative vegetation index and relative yield is calculated on the basis of vegetation index and yield of reference parcel. In our study, a well-grown parcel with normal fertilization is selected as a reference parcel from all experimental parcels. Relative vegetation index is calculated by vegetation index of studied parcel divided by vegetation index of the reference parcel [Eq. (1)], and relative yield is calculated by yield of studied parcel divided by yield of reference parcel [Eq.
(2)]. Schematic plot of the calculation method is shown in Figure 4.
In our study, three types of relative vegetation indices are tested, including Relative Normalized Difference Vegetation Index (RNDVI), Relative Ratio Vegetation Index (RRVI), and Relative Difference Vegetation Index (RDVI). All possible two-band combinations of discrete channels from 500 to 900 nm are used to create RNDVI, RRVI, RDVI, as demonstrated by Le Maire et al. (2008) [28], and linear regressions between those combinations and relative yields are performed to determine the correlation coefficient (r). All the r-values are plotted in matrix plots. The advantages of the matrix plots are that they give a quick overview of hundreds of wavelength combinations and make it possible to identify the optimal band combination for further analysis.
Where RY is parcel-level relative yield, Y is the measured yield of a study parcel, and Y R is the measured yield of reference parcel.

Yield Estimation Model
Since the good linear relationship between VI and crop yield, like rice and wheat, have been found in the previous studies (Tucker et al., 1980;Siyal et al., 2015), multiple linear regression method will be used to construct the rice yield estimation models in this study. The independent variables are depended on analysis results of the optimal band combinations above, and dependent variable is relative yield.
The relative VIs at different growth stages are all related with relative yield, but which single growth stage and growth stage combinations that can more accurately estimate rice yield will be determined through this study. Therefore, firstly, the optimal single-growth-stage model can be directly obtained by comparing the fitted model with different single growth stage. Secondly, the multiple-growth-stage model can be acquired by comparing the fitted model with two, three, four, five, and all combinations of growth stages. Coefficient of determination (R 2 ) and root mean square error (RMSE) between estimated and measured yield are calculated to quantify the performance of estimation models.

Band Selection for Relative VI and Determine the Optimal Relative VI Type
The illustrations of r between the vegetation indices of different growth stages and relative yield are shown in Figure 5. As seen for all growth stages, band combinations provided by wavelength 1 at red edge spectral region combined with wavelength 2 in nearinfrared area almost occupy high r-values areas. The locations of wavelength 1 and wavelength 2 of the optimal band combinations with the highest r-value for three types index and for every growth stage are shown in Table 1. It can be seen that the red edge bands from 712 to 744 nm and the near-infrared regions from 808 to 888 nm occur in all three vegetation types. These results indicate the great potential of red edge band and near-infrared regions for the estimation of rice yield.
The comparison of the optimal band combinations at every growth stage for all three types index, i.e., RNDVI type index, RRVI type index, and RDVI type index, are shown in Figure 5. As illustrated, compared with RRVI and RDVI type index, RNDVI type index is the most efficient type (Figure 6). The optimal relative vegetation indices of every growth stage are all RNDVI type index. In addition, the correlation coefficients show an increased tendency from tillering stage to booting stage and then decrease to ripening stage. The highest correlation coefficients between optimal relative vegetation index and relative yield for all three type index are always obtained at booting stage, with r of 0.87 for RNDVI [880,712] , 0.85 for RRVI [776,744] , and 0.84 for RDVI [840,744] . It can be concluded that booting stage is the most efficient growth stage for rice yield estimation.

Yield Estimation Model With Different Growth Stages Involved
Rice yield is related to the relative spectral variables at every growth stages. As concluded from above results, the estimation efficiency of RNDVI type index is the most effective compared with RRVI type index and RDVI type index, so in the process of model construction, only RNDVIs are considered. In order to identify the number of involved RNDVIs and the involved growth stages in the optimal yield estimation model, the univariate models with one single growth stage and the multivariate models including two, three, four, five, and all growth stages are tested using R 2 and RMSE.

Yield Estimation Model With One Single Growth Stage Involved
Yield estimation model with one single growth stage involved is the simplest yield estimation model, which has the advantages of simplicity and easy to calculate. The R 2 and RMSE of all estimation models with one single RNDVI at each growth stages are shown in Table 2.
As seen, compared with models using RNDVIs at other growth stages, the model using RNDVI at booting stage gains the highest R 2 of 0.75 and the lowest RMSE of 228.04 kg/ha. So the booting stage is regarded as the optimal growth stage for rice yield estimation. The model can be expressed as: where Y E is estimated Yield, Y R is reference yield.

Yield Estimation Model With Two Growth Stage Involved
The models including two RNDVIs of any two growth stages are shown in Table 3.
As seen, the best estimation model with R 2 of 0.80 and RMSE of 205.74 kg/ha, include booting stage (RNDVI [880,712] ) and filling stage (RNDVI [808,744] ). The expression of model is:

Yield Estimation Model With Three Growth Stages Involved
The model expressions and R 2 and RMSE of all three-variate models are shown in Table 4. As seen, the highest R 2 of 0.83  x 1 is RNDVI [720,880] at tillering stage, x 2 is RNDVI [744,808] at jointing stage, x 3 is RNDVI [712,880] at booting stage, x 4 is RNDVI [736,888] at heading stage, x 5 is RNDVI [744,808] at filling stage, x 6 is RNDVI [744,872]

Yield Estimation With All Grow Stage Combination
Yield estimation model including all optimal RNDVIs of every growth stage is expressed as Eq. (8)

F-Test for Optimal Model Selection
As can be seen in Table 7 and Figure 7, with the excessive number of RNDVIs including in the models, R 2 -values do not increase significantly while RMSE values don't decrease significantly. Yield estimation model would be complex if too many RNDVIs are included in the estimation model, but a univariate model using one RNDVI would decrease the estimation accuracy. In order to identify the optimal yield estimation model with greatest simplicity and accuracy simultaneously, all optimal multivariate models were tested using F-test. The optimal estimation model would be selected from the models that with F-test reaching 1% significant level and simultaneously with lower RMSE and less RNDVIs included. As can be seen in Table 7, F-test of all models reach 1% significant level. Then, taking into consideration R 2 and RMSE, the multivariate model including three RNDVIs, with R 2 of 0.83 and RMSE of 189.13 kg/ha, is identified as the optimal estimation model. The equation of selected model is expressed in Eq. (5).

Model Validation
The estimation model Eq. (5) is validated using leave-oneout cross validation (LOOCV) method. The agreement between estimated and measured yield is tested using RMSE. The scatter plot of estimated and measured yield are shown in Figure 8. Diagonal lines are the 1:1 line. Ideally, when points are located on the 1:1 line, it should be a perfect match. As can be seen, all the estimated yields are acceptable and a low average relative error of 3%, and an RMSE of 215.08 kg/ha are obtained.

DISCUSSION
Grain yield estimations with UAV-based remotely sensed data have been carried out for many types of crops. Stroppiana et al. (2015) used UAV-derived NDVI at one single growth stage to construct rice yield estimation in northern Italy and with R 2 of 0.5. Zhou et al. (2017) also constructed rice yield estimation models with UAV-derived NDVI at one single growth stage and NDVIs at two or more grows stages, with R 2 of 0.75 for univariate model and 0.76 for multivariate model, respectively. Compared with those researches, our study presents the idea of relative vegetation index and relative yield to improve the rice yield estimation accuracy. As the results have shown, the univariate model using RNDVI at booting stage obtained a R 2 of 0.75 and the multivariate model including three RNDVIs, i.e., RNDVI [808,744] at jointing stage, RNDVI [880,712] at booting stage and RNDVI [808,744] at filling stage, gained a R 2 of 0.83. The results demonstrated that when using relative vegetation index and the relative yield, a high accuracy of yield estimation can be obtained. In this study, the RNDVI at booting stage shows the highest correlation with rice yield. This conclusion is consistent with previous studies (Chang et al., 2005;Zhou et al., 2017), in which booting stage was regarded as the optimal growth stage for rice yield estimation. Booting stage is a transitional period from vegetative growth to reproductive growth and can reflect yield potential (Zhou et al., 2017). Combining booting stage with other growth stages can further improve model estimation accuracy. The great potential of booting stage used for rice yield estimation is also proved in present study. As seen in Table 7, the optimum univariate model and multivariate models all included the RNDVIs at booting stage.
The large number of narrow spectral bands results in a high inter-correlation between them (Darvishzadeh et al., 2008;Yue et al., 2017). In our study, the correlation between   relative VI from all possible two-band combinations and relative yield are used to identify the optimal band combination to avoid the band autocorrelation and redundancy. The optimum relative vegetation indices are mostly composed by the bands located at red edge region (712, 720, 736, and 744 nm) and near infrared region (808, 872, and 888 nm). These results are consistent with the previous study (Ren et al., 2011;Fu et al., 2014;Verger et al., 2014). The bands in the red edge region are sensitive to the vegetation changes while the bands in near infrared region are sensitive to plant grow and can indicate the health status of leaves.
In this study, we conducted a 1-year rice experiment with two rice varieties and five nitrogen levels at a specific region. The results obtained in this study may be limited by rice varieties and research regions. In addition, due to the limitation of data volume, no systematic comparison between the models with the relative spectral index and with the traditional spectral index was made. So, this method needs to be further validated by using multi-year, multi-region, and multi-variety data.

CONCLUSION
The study proposed a rice yield estimation method with the parcel-level relative vegetation index as input variables to overcome the external effects, such as different background, illumination and atmospheric conditions, on the absolute differences of time-series vegetation indices. Three types of relative vegetation indices were constructed using all possible two-band combinations of discrete channels from 500 to 900 nm, including RRVI, RNDVI, and RDVI. The main conclusions drawn from this study are: (1) RNDVI is the optimal type of relative vegetation index for rice yield estimation compared with RRVI and RDVI, and the correlation between RNDVIs and rice yield are significantly higher than RRVIs vs. yield and RDVI vs. yield; (2) the optimal RNDVIs at different growth stages are generally composed by bands in red edge and near infrared regions; (3) the booting stage is the optimum stage for yield estimation when estimation model constructed by RNDVI at one single growth stage; (4) the multiple-growth-stage model, including three RNDVIs, i.e., RNDVI [808,744] at jointing stage, RNDVI [880,712] at booting stage and RNDVI [808,744] at filling stage was determined as the optimum model among the multivariate models that include RNDVIs at different growth stages. The promising ability of parcel-level relative vegetation indices for rice yield estimation is proven in this study, however, the further applications of relative vegetation indices on rice or other crop yield estimation are still needed for a solid conclusion using more measurements. This will be carried out in our future work.