Dynamic monitoring of maize grain quality based on remote sensing data

Remote sensing data have been widely used to monitor crop development, grain yield, and quality, but precise monitoring of quality traits, especially grain starch and oil contents under varying meteorological conditions, still needs improvement. In this study, a field experiment with four sowing dates, i.e., 8 June, 18 June, 28 June, and 8 July, was conducted in 2018–2020. A scalable annual and inter-annual quality prediction model for summer maize in different growth periods was established using hierarchical linear modeling (HLM), which combined hyperspectral and meteorological data. Compared with multiple linear regression (MLR) using vegetation indices (VIs), HLM clearly improved prediction accuracy, with the best coefficient of determination (R²), root mean square error (RMSE), and mean absolute error (MAE) values of 0.90, 0.10, and 0.08, respectively, for grain starch content (GSC); 0.87, 0.10, and 0.08, respectively, for grain protein content (GPC); and 0.74, 0.13, and 0.10, respectively, for grain oil content (GOC). In addition, combining the tasseling, grain-filling, and maturity stages further improved the predictive power for GSC (R² = 0.96), combining the grain-filling and maturity stages further improved it for GPC (R² = 0.90), and combining the jointing and tasseling stages improved it for GOC (R² = 0.85). The results also showed that meteorological factors, especially precipitation, had a great influence on grain quality monitoring. This study provides a new approach to crop quality monitoring by remote sensing.

In addition, maize can be used as a raw material for food and industrial processing to produce oil, starch, sweeteners, and industrial alcohol (FAO, 2020). All of these applications place demands on maize quality. However, previous studies have largely focused on grain yield prediction and the response of crop growth to biotic and abiotic stresses (Vaughan et al., 2018; Pradawet et al., 2022; Shuai and Basso, 2022), whereas studies of maize grain quality, especially quality parameters other than protein content, have yet to be given due attention (Nuttall et al., 2017).
Accurate and timely regional crop growth monitoring and early grain quality evaluation are important directions for field management. However, the traditional approach to assessing grain quality, i.e., biochemical testing, though accurate, is limited by low efficiency, weak regional representation, and high cost. Thus, remote sensing technology and hyperspectral analysis have been widely applied to crop production estimation because of their advantages in high-throughput, non-destructive, and prospective monitoring (Rodrigues et al., 2018; Xu et al., 2020; Ma et al., 2022). Crop remote sensing monitoring is realized by obtaining electromagnetic wave reflection information from the canopy through a passive sensor (Ma et al., 2022). Crop traits can be inferred from vegetation indices (VIs) derived from remote sensing images in the visible, near-infrared, and short-wave infrared bands (Khanal et al., 2017). Moreover, remote sensing data that indicate the growth and vitality of the crop canopy can provide necessary information for estimating maize grain quality. Cho and Kang (2020) suggested that glucose produced by plant photosynthesis can be polymerized into sucrose, which is transported to the grain as the main source of starch accumulation. VIs reflecting the nitrogen and chlorophyll contents of the canopy are good indicators of crop photosynthetic state (Schlemmer et al., 2013), so they can indirectly represent the level of starch accumulation. Grain oil content (GOC) can also be inferred from plant nitrogen and carbohydrate accumulation (Ghafoor et al., 2021). Grain protein formation depends on nitrogen transport and water availability, and spectral indices with high sensitivity to canopy water content and nitrogen status may reflect grain protein content (GPC) effectively (Zhao et al., 2005; Ma et al., 2022). However, the VIs strongly related to maize grain quality, especially grain starch content (GSC) and GOC, are not clear.
In addition, with significant differences discovered across growth stages in spectral characteristics of maize (Panigrahi and Das, 2018), the determination of the optimum VIs at different stages is of importance for grain quality assessment.
The spectral information obtained by remote sensing has been used to construct evaluation models of grain quality in recent years, mainly for GSC and GPC. For GSC prediction, Tan et al. (2011) established a direct prediction model of GSC using the structure insensitive pigment index (SIPI). The accuracy of the indirect model of GSC based on leaf nitrogen content (LNC) and SIPI was 9.7% higher than that of the direct model. Zhao et al. (2005) found that a VI derived from the canopy spectral reflectance at the green and red bands was significantly correlated with the final GPC. Onoyama et al. (2011) used the normalized difference vegetation index (NDVI) with partial least squares regression (PLSR) analysis to estimate GPC, but they also found that the prediction error at least doubled when a prediction model was applied to data from other years. VIs alone often cannot fully capture grain quality during crop growth, because the formation of grain quality is affected by many factors (Zhang et al., 2022). Environmental factors, photosynthesis, canopy structure, and nutrient uptake directly or indirectly affect crop growth and final grain quality (Asseng and Milroy, 2006; Ning et al., 2012). Weather information, crop growth state, and their interaction should be considered in the construction of grain quality prediction models (Butts-Wilmsmeyer et al., 2019).
Crop models have been considered as a way to combine the effects of different factors. For example, the leaf area index and canopy nitrogen accumulation were used to correct the DSSAT-CERES crop model, achieving high-precision prediction of GPC (Li et al., 2015). However, the large number of input variables (meteorology, soil, management, and other data) and complex assimilation algorithms potentially reduce predictive power and limit the large-scale application of crop models (Jin et al., 2018). Therefore, constructing the grain quality prediction model by synthesizing key indices with a relatively simple algorithm offers a new approach. Li et al. (2020) developed an inter-annual expandable grain quality prediction model using hierarchical linear modeling (HLM), which considers the interaction between remote sensing data and meteorological data and achieves high accuracy. HLM can incorporate hierarchical data, including nested data structures, combining multiple levels and multiple variables (Wilson et al., 2011). In HLM, there are usually two or three levels to interpret the results of different independent variables. HLM can, therefore, not only account for regional and inter-annual changes in crop production but also analyze the relationship between crop quality and remote sensing information in different regions and growth stages (Xu et al., 2020). However, the application of HLM to maize quality, especially grain starch and oil prediction, still needs to be improved.
Grain quality composition is the comprehensive result of various factors in crop growth stages (Dente et al., 2008;Wang et al., 2018). The factors are first reflected in the changes in growth trends with different growth periods (Li et al., 2014;Koca and Erekul, 2016) and then transformed into effects on quality. In other words, the actual growth of crops at different growth stages also contains a large amount of grain quality information (Xie et al., 2020). Therefore, based on field test data of summer maize at different sowing dates (8 June, 18 June, 28 June, and 8 July) in Gucheng Agricultural Meteorology National Observation and Research Station during 2018-2020, this study constructed monitoring models for maize grain quality (starch, protein, and oil) by HLM coupled with critical hyperspectral remote sensing data and meteorological information at different growth stages. Then, combined with different growth periods, multi-phase monitoring models were established to provide a reference for improving the monitoring capacity of grain quality.

Field site description and experimental design
The study was conducted at Gucheng Agricultural Meteorology National Observation and Research Station (39°08′N, 115°40′E), Dingxing County, Baoding City, Hebei Province, China (Figure 1). The region is located in the central part of North China and has a temperate monsoon climate. The annual mean temperature, annual mean precipitation, and annual sunshine were 12.2°C, 494 mm, and 2,403.6 h, respectively. In this region and the surrounding area, maize is grown in rotation after wheat; it is sown in June and harvested from September to early October each year. The soil at the experimental site is a sandy loam with, on average within a depth of 50 cm, total nitrogen of 87.00 mg kg⁻¹, total phosphorus of 25.67 mg kg⁻¹, total potassium of 118.55 mg kg⁻¹, organic matter content of 13.67 g kg⁻¹, and pH of 8.1.
The maize hybrid Lianyu 1 was sown on 8 June (S1), 18 June (S2), 28 June (S3), and 8 July (S4) in 2018-2020. A randomized complete block design with three replications was used to arrange the treatments (Figure 2). The observation time of each growth stage, i.e., jointing, tasseling, grain-filling, and maturation, under the different treatments is shown in Table 1. Maize was planted at a density of 52 plants per plot (65,000 plants hm⁻²), with a plot dimension of 4 m × 5 m (an area of 20 m²). A guard row with a width of 2 m was set around the perimeter of the experimental field.
Field management, including irrigation, fertilization, and pest and weed control, was carried out according to local practices.

Canopy hyperspectral reflectance data
Canopy hyperspectral reflectance of each plot at the jointing, tasseling, grain-filling, and maturation stages of maize was collected using an ASD FieldSpec3 Spectrometer (Analytical Spectral Devices, Inc., Boulder, CO, USA). The sensor covers the spectral range of 350-2,500 nm, with sampling intervals of 1.38 nm for 350-1,050 nm and 2.0 nm for 1,000-2,500 nm. The spectrometer was positioned approximately 1.0 m above the canopy, perpendicular to the ground, with a view angle of 25° so that only the plant canopy was seen, minimizing noise from the soil background. In addition, all observations were made under clear sky conditions between 11:30 and 14:00 to ensure adequate light intensity. The canopy reflectance of each plot was calculated as the average of five spectra, which were calibrated using a 0.4 m × 0.4 m BaSO₄ calibration panel before and after measurements. The timing of field canopy spectral measurements under the different treatments over the years was consistent with the observation time of each growth period shown in Table 1.

Meteorological data
Meteorological data, including daily mean temperature (°C), daily precipitation (mm), and daily solar radiation (MJ m⁻²) during the summer maize growth season from 2018 to 2020, were obtained from an on-site automated weather station in Gucheng. For each phase, i.e., jointing, tasseling, grain-filling, and maturation, the effective accumulated temperature (AT), total precipitation (Pre), and total solar radiation (Rad) are the sums of daily mean temperature (≥10°C), daily precipitation, and daily solar radiation from sowing to that growth period, respectively.
Figure 1. Location of study area in Dingxing County, Baoding City, Hebei Province, China.
Figure 2. The split plot design for four sowing dates (i.e., S1, S2, S3 and S4) of maize in 2018-2020. R means repetition.
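The stage-wise accumulation of AT, Pre, and Rad described in this subsection can be sketched as follows. This is a minimal illustration, not the authors' code: the `DailyWeather` record and the `accumulate_to_stage` function are hypothetical names, and the sketch assumes that AT adds the daily mean temperature itself on days at or above 10°C and that both the sowing date and the stage date are included in the sums.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class DailyWeather:
    day: date
    mean_temp_c: float      # daily mean temperature (°C)
    precip_mm: float        # daily precipitation (mm)
    radiation_mj_m2: float  # daily solar radiation (MJ m^-2)


def accumulate_to_stage(records, sowing, stage_date, base_temp_c=10.0):
    """Return (AT, Pre, Rad) summed from sowing to stage_date, both inclusive."""
    at = pre = rad = 0.0
    for rec in records:
        if sowing <= rec.day <= stage_date:
            # Assumption: AT adds the daily mean itself on days at or above the base.
            if rec.mean_temp_c >= base_temp_c:
                at += rec.mean_temp_c
            pre += rec.precip_mm
            rad += rec.radiation_mj_m2
    return at, pre, rad
```

Calling the function once per sowing date and growth stage would give the AT, Pre, and Rad values that enter the later analyses.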

Grain quality analyses
Maize grain quality including starch, protein, and oil contents (%) was measured after harvest. GSC of dry samples was analyzed using the acid hydrolysis method in accordance with the Chinese Standard GB/T 5009.9-2016 (2016). GPC and GOC were evaluated by the macro Kjeldahl method and Soxhlet extractor method according to the National Standard of China numbered GB/T 5009.5-2016 and GB/T 5009.6-2016, respectively.

Selection of vegetation indices
Based on previous research, 15 VIs were considered potential indicators of grain quality (Table 2), and the correlation between each VI and grain quality was analyzed. The optimal VIs were then selected to construct the grain quality model. The k-nearest neighbor (kNN) method was applied to impute the missing value for the sample at the tasseling stage under the S2 treatment in 2018. To impute a missing value for a target sample, the k most similar samples are found according to a defined distance measure (calculated using the VI values that exist in both the target sample and the candidate neighbor sample), and their values are used to estimate the missing one (Kim et al., 2005).
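A minimal sketch of kNN imputation as described above is given below. The value of k, the distance measure, and the way neighbors are combined are not reported in the text, so the choices here are assumptions: a root-mean-square Euclidean distance over the VI values present in both samples, and an unweighted mean of the k nearest neighbors. The function name `knn_impute` is hypothetical.

```python
import math


def knn_impute(samples, target_idx, k=3):
    """Fill None entries of samples[target_idx] from its k nearest neighbors.

    samples: list of equal-length lists of VI values, with None for missing.
    """
    target = samples[target_idx]
    filled = list(target)
    for j, value in enumerate(target):
        if value is not None:
            continue
        scored = []
        for i, cand in enumerate(samples):
            # A neighbor must have the feature that the target is missing.
            if i == target_idx or cand[j] is None:
                continue
            shared = [(t, c) for t, c in zip(target, cand)
                      if t is not None and c is not None]
            if not shared:
                continue
            # Assumption: divide by the number of shared features so that
            # candidates with different overlaps remain comparable.
            dist = math.sqrt(sum((t - c) ** 2 for t, c in shared) / len(shared))
            scored.append((dist, cand[j]))
        scored.sort(key=lambda pair: pair[0])
        nearest = scored[:k]
        if nearest:
            filled[j] = sum(v for _, v in nearest) / len(nearest)
    return filled
```

In practice the VIs would usually be put on a common scale before computing distances, since indices with larger numeric ranges would otherwise dominate the neighbor search.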

Multiple linear regression
First, a multiple linear regression equation (Eq. 1) was used to establish a grain quality prediction model based on the vegetation indices:
$$G = k + k_1 \cdot VI_1 + k_2 \cdot VI_2 + \epsilon \tag{1}$$
where $G$ is the measured grain quality, and $VI_1$ and $VI_2$ are the two vegetation indices with the strongest correlation with grain quality under different phases. In addition, $k$, $k_1$, and $k_2$ represent the constant term and the parameter terms for the corresponding variables of the linear model, and $\epsilon$ is the error term.

Gray relation analysis
Gray relation analysis (GRA) was used to quantify the indefinite relation between the characteristic variables (i.e., GSC, GPC, and GOC) and the independent variables (i.e., VIs, AT, Pre, and Rad). The main procedure of GRA is as follows. First, the values of all variables are normalized to produce comparable sequences, which is called gray relation generation. From these sequences, the reference sequence (the characteristic variable) is defined. Then, the gray relation coefficients between each comparable sequence and the reference sequence are calculated. Finally, based on these gray relation coefficients, the gray relation level, represented by the relevant degree (RD) between the reference sequence and each comparable sequence, is calculated.
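The GRA procedure can be illustrated with the short sketch below. The normalization scheme and the distinguishing coefficient are not stated in the text, so this example assumes min-max normalization, a distinguishing coefficient of 0.5, and equal weights when averaging the coefficients into the RD. The function names are hypothetical.

```python
def min_max(seq):
    """Scale a sequence to [0, 1]; a constant sequence maps to zeros."""
    lo, hi = min(seq), max(seq)
    if hi == lo:
        return [0.0 for _ in seq]
    return [(v - lo) / (hi - lo) for v in seq]


def gray_relation_degree(reference, comparisons, rho=0.5):
    """Return {name: RD} for each comparison sequence against the reference.

    reference: values of the characteristic variable (e.g., GSC).
    comparisons: {name: values} for the factors (e.g., VIs, AT, Pre, Rad).
    rho: distinguishing coefficient (assumed 0.5 here).
    """
    ref = min_max(reference)
    deltas = {
        name: [abs(r - c) for r, c in zip(ref, min_max(seq))]
        for name, seq in comparisons.items()
    }
    all_deltas = [d for ds in deltas.values() for d in ds]
    d_min, d_max = min(all_deltas), max(all_deltas)
    if d_max == 0:
        # Every sequence coincides with the reference after normalization.
        return {name: 1.0 for name in deltas}
    degrees = {}
    for name, ds in deltas.items():
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in ds]
        degrees[name] = sum(coeffs) / len(coeffs)
    return degrees
```

Ranking the factors by the returned RD gives an ordering of the kind reported in Table 5, although the exact values depend on the normalization and on the choice of `rho`.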

Hierarchical linear modeling
Furthermore, HLM was adopted to account for the influence of inter-annual environmental factors on grain quality. HLM is a least squares regression analysis method that considers the nested structure of data and has begun to be applied to yield and GPC prediction in recent years (Li et al., 2020; Xu et al., 2020). GSC and GOC are also affected by nitrogen and carbon metabolism, which can be reflected by the maize canopy spectrum (Yue et al., 2022; Li et al., 2023). Thus, HLM was applied to estimate the starch and oil contents in this study. HLM stratifies the dataset to comprehensively analyze the relationship between the data within the layer (the first-layer model) and the data outside the layer (the second-layer model) in view of the independence between the data. In this study, the first-layer model (L1) was a quality prediction model based on the canopy spectra:
$$\text{L1: } G = b_{0j} + b_{1j} \cdot VI_1 + b_{2j} \cdot VI_2 + r_{ij} \tag{2}$$
where $b_{0j}$, $b_{1j}$, and $b_{2j}$ represent the intercept and the coefficients of $VI_1$ and $VI_2$ under different phases in L1, respectively, and $r_{ij}$ represents the random error. The second-layer model (L2) relates the normalized values of the L1 model coefficients to the external meteorological data (AT, Pre, and Rad), as follows:
$$\text{L2: } b_{nj} = g_{n0} + g_{n1} \cdot AT + g_{n2} \cdot Rad + g_{n3} \cdot Pre + m_{nj} \tag{3}$$
where $b_{nj}$ ($n$ = 0, 1, 2) denotes the intercept and slopes of L1; $g_{n0}$ is the intercept of L2; $g_{n1}$, $g_{n2}$, and $g_{n3}$ are the model coefficients of AT, Rad, and Pre in L2, respectively; and $m_{nj}$ is the random error.
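The two-layer structure can be made concrete with a simplified sketch. It is a two-step least squares approximation (fit L1 within each group, then regress each L1 coefficient on AT, Rad, and Pre across groups), not the estimator used by dedicated HLM software, and it skips the normalization of the L1 coefficients mentioned above. The grouping (for example, one group per year and sowing date), the data layout, and the names `solve_ols`, `fit_two_level`, and `predict_quality` are all assumptions made for illustration. Only the standard library is used so that the example runs anywhere; NumPy or statsmodels would be the usual choice in practice.

```python
def solve_ols(X, y):
    """Least squares coefficients from the normal equations (X'X) b = X'y.

    Solved by Gauss-Jordan elimination; X must have full column rank.
    """
    p = len(X[0])
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(p)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(p)
    ]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


def fit_two_level(groups):
    """Fit L1 per group, then L2 across groups.

    groups: {name: {"vi": [(vi1, vi2), ...], "g": [...], "weather": (AT, Rad, Pre)}}
    Returns gammas, where gammas[n] = [g_n0, g_n1, g_n2, g_n3] for n = 0, 1, 2.
    """
    names = list(groups)
    # L1: G = b0j + b1j*VI1 + b2j*VI2 within each group j.
    level1 = {
        name: solve_ols([[1.0, v1, v2] for v1, v2 in groups[name]["vi"]],
                        groups[name]["g"])
        for name in names
    }
    # L2: b_nj = g_n0 + g_n1*AT + g_n2*Rad + g_n3*Pre across groups.
    X2 = [[1.0, *groups[name]["weather"]] for name in names]
    return [solve_ols(X2, [level1[name][n] for name in names]) for n in range(3)]


def predict_quality(gammas, weather, vi1, vi2):
    """Predict G for new VI values under the given (AT, Rad, Pre)."""
    w = [1.0, *weather]
    b = [sum(g * x for g, x in zip(gamma, w)) for gamma in gammas]
    return b[0] + b[1] * vi1 + b[2] * vi2
```

The second step needs at least four groups with distinct weather values to estimate the four L2 parameters, and scaling AT, Rad, and Pre to comparable ranges beforehand keeps the normal equations well conditioned.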

Combination of prediction model under different phases
In this research, the stepwise multiple regression method was used to combine the prediction models under different phases and thereby improve prediction power. To avoid multicollinearity among the independent variables, we first tested the variance inflation factor (VIF) of each variable in the equation and then obtained the optimal regression equation by stepwise multiple regression analysis. Bidirectional elimination was applied to test the variables to be included or excluded at each step. From all the independent variables available for selection, the variables that had a significant impact on the dependent variable were selected to establish the regression equation, while variables with no significant effect on the dependent variable were not added to the equation. The equation was expressed as follows:
$$G = b_0 + b_1 G_1 + b_2 G_2 + \cdots + b_n G_n + \epsilon \tag{4}$$
where $b_0$ is a constant; $G_1$, $G_2$, …, $G_n$ are the optimal independent variables (the predicted values under different phases) for the model; $b_1$, $b_2$, …, $b_n$ are the regression coefficients corresponding to the predicted values in different phases; and $\epsilon$ is the error term.
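The multicollinearity check that precedes the stepwise regression can be sketched as follows. For each candidate variable, the VIF is 1 / (1 − R²) from regressing that variable on all the others. The function names are hypothetical, and the VIF threshold the authors applied is not reported, so none is built in. Only the standard library is used so that the example is self-contained; statsmodels offers an equivalent routine.

```python
def solve_ols(X, y):
    """Least squares coefficients from the normal equations (X'X) b = X'y."""
    p = len(X[0])
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(p)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(p)
    ]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


def r_squared(X, y):
    """R² of the least squares fit of y on the columns of X."""
    beta = solve_ols(X, y)
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    if ss_tot == 0:
        raise ValueError("constant column: VIF is undefined")
    return 1.0 - ss_res / ss_tot


def variance_inflation_factors(columns):
    """columns: {name: values}, e.g. the predicted values G_1..G_n per stage."""
    result = {}
    for name, target in columns.items():
        others = [vals for other, vals in columns.items() if other != name]
        X = [[1.0, *row] for row in zip(*others)]  # intercept + other variables
        r2 = r_squared(X, target)
        result[name] = float("inf") if r2 >= 1.0 else 1.0 / (1.0 - r2)
    return result
```

Variables with a large VIF carry largely redundant information, which is one reason why stage-wise predictions that track each other closely may not all enter the final equation.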

Statistical analysis
Pearson's correlation coefficients (r) between vegetation indices and grain quality (GSC, GPC, and GOC) were analyzed using IBM SPSS Statistics 24.0 (IBM Corp., Armonk, NY, USA). In all prediction models, 80% of the dataset was used as the modeling set and 20% as the validation set, and the random state was set to 42 to ensure consistent dataset partitioning across phases. To test the performance of the different prediction models, the coefficient of determination (R²) (Eq. 5), root mean square error (RMSE) (Eq. 6), and mean absolute error (MAE) (Eq. 7) were used as measures of accuracy:
$$R^2 = 1 - \frac{\sum_{i=1}^{n} (Y_i - Y_i')^2}{\sum_{i=1}^{n} (Y_i - \bar{Y})^2} \tag{5}$$
$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (Y_i - Y_i')^2} \tag{6}$$
$$MAE = \frac{1}{n} \sum_{i=1}^{n} \left| Y_i - Y_i' \right| \tag{7}$$
where $n$ is the number of observations; $Y_i$ and $Y_i'$ are the ith measured and simulated data, respectively; and $\bar{Y}$ is the mean value of the measured data.
All statistical indicators were calculated using Python 3.9 (Python Software Foundation, Portland, OR, USA), and all figures were drawn with Microsoft Office Excel (Microsoft Corporation, USA) and the ggplot package of the R language (RStudio Inc., Boston, MA, USA).
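The three accuracy measures can be computed as in the sketch below, which follows the usual definitions of R², RMSE, and MAE with the measured mean in the denominator of R². The function name `regression_metrics` is hypothetical, and this is not the authors' script.

```python
import math


def regression_metrics(measured, simulated):
    """Return R², RMSE, and MAE for paired measured and simulated values."""
    n = len(measured)
    mean_y = sum(measured) / n
    ss_res = sum((y - s) ** 2 for y, s in zip(measured, simulated))
    ss_tot = sum((y - mean_y) ** 2 for y in measured)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(y - s) for y, s in zip(measured, simulated)) / n,
    }
```

With this definition R² can become negative on a validation set when the model predicts worse than the mean of the measured values, which is worth keeping in mind when comparing modeling and validation results.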

Correlation between VIs and grain quality
The correlation between the VIs calculated from the calibration dataset and grain quality under different phases is depicted in Figure 3. For GSC, only the normalized pigment chlorophyll index (NPCI) was significantly correlated with GSC at the jointing stage. However, 60% of the VIs were significantly correlated with GSC at the tasseling and grain-filling stages. No significant correlation was found between VIs and GSC at the maturation stage. For GPC, no VI was significantly correlated with GPC at the jointing stage. Approximately 67% of the VIs were significantly correlated with GPC at the tasseling and grain-filling stages. In addition, 40% of the VIs showed a significant correlation with GPC at the maturation stage. In particular, the VIs that showed a significant correlation with GSC and with GPC at the tasseling and grain-filling stages were highly consistent. For GOC, VIs at the jointing stage had no significant correlation with GOC except for the normalized difference water index (NDWI). At the tasseling stage, 40% of the VIs were significantly correlated with GOC. At the grain-filling stage, 73% of the VIs were significantly correlated with GOC, while only 27% were significantly correlated with GOC at the maturation stage.
The relationship between grain quality and the two VIs with the highest correlation at each growth stage is given in Table 3. VIs had no significant correlation with GSC, GPC, and GOC at the jointing stage (p < 0.01), except for NPCI with GSC (r = 0.37) and NDWI with GOC (r = 0.33). All of the selected best two VIs showed a highly significant correlation with GSC, GPC, and GOC at the tasseling and grain-filling stages (p < 0.01). At the tasseling stage, the MERIS terrestrial chlorophyll index (MTCI) (r = 0.69), MTCI (r = 0.64), and NDWI (r = 0.55) showed the best correlation with GSC, GPC, and GOC, respectively. At the grain-filling stage, Maccioni (r = 0.69), MTCI (r = 0.74), and the ratio vegetation index (RVI) (r = 0.58) showed the highest significant correlation (p < 0.01) with GSC, GPC, and GOC, respectively. At the maturation stage, NDWI and NDVI had no significant correlation with GSC, while RVI and Vogelmann red edge index 2 (VOG2) (r = 0.44 and −0.43, respectively) and the optimized soil-adjusted vegetation index (OSAVI) and normalized difference red edge index (NDRE) (r = 0.41 and 0.39, respectively) showed a significant correlation with GPC and GOC, respectively.

Grain quality prediction model using MLR
The VIs shown in Table 3 were used to construct the prediction model under different phases. For the GSC, GPC, and GOC predictions, R² was poor at the jointing and maturation stages. The multiple linear regression (MLR) model for the GSC prediction had R², RMSE, and MAE values of 0.49, 1.28%, and 0.84%, respectively, at the tasseling stage, and 0.53, 1.23%, and 0.91%, respectively, at the grain-filling stage (both using Maccioni and MTCI); the tasseling stage was considered the best because the grain-filling model showed overfitting on the validation set. For the GPC prediction, the best accuracy was found at the grain-filling stage with R², RMSE, and MAE values of 0.49, 0.56%, and 0.42%, respectively. For the GOC prediction, the accuracy of the MLR decreased further, with an R² of 0.36 at the tasseling stage and 0.30 at the grain-filling stage (Table 4). For GSC and GPC, low prediction power was associated with a smaller fluctuation range of the predicted values than of the measured values, while the reverse held for the GOC prediction (Figure 4). An MLR model constructed from VIs alone cannot capture the variability of factors other than crop canopy traits, leading to poor prediction accuracy.

Relevant degree between VIs, meteorological factors, and grain quality using GRA
Unlike correlation analysis, which was used to determine the strength of linear relationships between factors, GRA can be used to comprehensively describe the importance of different factors to the target value. The results showed that the meteorological factors, i.e., AT, Rad, and Pre, had a strong relation to GSC; in particular, the RD between accumulated precipitation and GSC ranked first in all stages (RD = 0.71-0.77). Most noteworthy was the total precipitation during the maize growth period (sowing-maturation), which gave the strongest relevant degree with GSC (RD = 0.75), followed by total accumulated temperature (RD = 0.61) and total radiation (RD = 0.60). Meteorological factors showed weak RD with GPC in all stages except for total precipitation from sowing to maturation (RD = 0.72, ranked second). Compared to meteorological factors, VIs had a stronger relevant degree with GOC in all stages except the early growth stage (sowing-jointing) (Table 5).

Grain quality prediction model using HLM
As shown in Table 5, meteorological factors were related to grain quality, especially GSC. Therefore, meteorological factors were taken into account in the grain quality prediction model using HLM. The two VIs that had the highest correlation with quality were also applied in the construction of the quality prediction model for comparison with MLR. The parameters in Table 6, i.e., $g_{n1}$, $g_{n2}$, and $g_{n3}$, explain the contributions of AT, Rad, and Pre, respectively. For the GSC prediction, high model accuracy was achieved in the middle and late growth stages, with R², RMSE, and MAE values of 0.90, 0.10, and 0.08, respectively (from sowing to tasseling); 0.85, 0.12, and 0.09, respectively (from sowing to grain-filling); and 0.85, 0.12, and 0.10, respectively (from sowing to maturation). For the GPC prediction, the model in the later growth stages performed better, with R², RMSE, and MAE values of 0.84, 0.11, and 0.10, respectively (from sowing to grain-filling), and 0.87, 0.10, and 0.08, respectively (from sowing to maturation). The GSC and GPC models followed similar patterns, but the GOC model was completely different, with the best R², RMSE, and MAE values of 0.74, 0.13, and 0.10, respectively, at the early growth stage from sowing to jointing. The GOC model in the middle and late growth stages showed poor predictive ability (Table 7). Compared with the MLR method, the HLM method showed significant improvement in the prediction of GSC, GPC, and even GOC under different phases (Figure 5). In addition, the HLM model generally performed well over the respective years (2018-2020), with R² varying from 0.79 to 0.84 and RMSE from 0.13 to 0.15 at the grain-filling stage for GSC estimation, and R² varying from 0.70 to 0.82 and RMSE from 0.14 to 0.17 at the grain-filling and maturation stages for GPC estimation. For the evaluation of GOC, the highest R² varied from 0.62 to 0.70 and the lowest RMSE from 0.16 to 0.19 at the jointing stage. In general, the prediction performance of HLM in 2018 and 2020 was slightly better than in 2019, with R² improved by 0.03 and 0.05 (GSC), 0.05 and 0.09 (GPC), and 0.08 and 0.04 (GOC) (Figure 6).

Combination of grain quality prediction model under different growth stages
To further improve the prediction accuracy of grain quality, we combined the predicted values using the HLM method under different growth stages through multiple stepwise regression equations to obtain the best models (Table 8). For the GSC prediction, the best model was produced by the combination of the predicted values under three stages, i.e., the tasseling, grain-filling, and maturation stages, with R², RMSE, and MAE values of 0.96, 0.06, and 0.05, respectively. For the GPC and GOC predictions, the best model was the combination of two growth stages. The difference is that the best GPC prediction model was the combination of the grain-filling and maturation stages with R², RMSE, and MAE values of 0.90, 0.09, and 0.07, respectively, while the best GOC prediction model was the combination of the jointing and tasseling stages with R², RMSE, and MAE values of 0.85, 0.10, and 0.08, respectively.
Table note (columns: Phases, Fixed effect): the absolute value of the correlation coefficient of VI 1 is larger than that of VI 2 (Table 3). GSC, grain starch content; GPC, grain protein content; GOC, grain oil content; HLM, hierarchical linear modeling; VIs, vegetation indices.

Prediction model of GSC, GPC, and GOC
In this study, the VIs that significantly correlated with GSC and GPC were basically the same at the tasseling and grain-filling stages. Maccioni and MTCI showed the strongest correlation with GSC and GPC across all phases (Table 3). These results indicated that the spectral bands monitoring GSC and GPC were similar and, in addition, that chlorophyll level had a great influence on GSC and GPC. Over the course of growth, the correlation between VIs and grain quality started to increase at the jointing stage, reached a plateau at the tasseling and grain-filling stages, and declined at the maturation stage. The high correlation between VIs and grain quality at the tasseling and grain-filling stages might be related to the more accurate vegetation indices obtained from the fully extended leaves, high coverage, and mature canopy. However, a weak correlation was indicated at the jointing and maturation stages, when vegetation coverage and chlorophyll content were low (Xie et al., 2020). In addition, the correlation between VIs and GSC or GPC was significantly stronger than that between VIs and GOC, because the VIs mainly reflected information on canopy chlorophyll and nitrogen contents and could be used to predict GSC and GPC directly.
However, the MLR model with only VIs showed poor prediction accuracy (Table 4), as also reported by Xue et al. (2007) and Chen (2020). VI, which represents the canopy state, is a key and direct indicator for grain yield prediction (Kuri et al., 2014), whereas the relationship between VIs and grain quality is indirect. Grain quality is affected by many factors, such as hybrid characteristics, agronomic practices, and weather information (Gooding, 2017). Although the genotype has a great influence on the final grain quality composition, the temperature, water, and other conditions during the whole crop growth stage, especially the meteorological factors during the critical growth period, also play a role in the formation of grain quality (Butts-Wilmsmeyer et al., 2019). Therefore, the environmental conditions at different growth stages were fully considered in this study, and remote sensing information and meteorological data were combined to construct GSC, GPC, and GOC monitoring models for maize using the HLM method. Among all growth phases, the prediction performance improved most clearly at maturity, with R² increasing from 0.03 to 0.85 for GSC, from 0.14 to 0.87 for GPC, and from 0.21 to 0.50 for GOC. The best prediction accuracy during the whole growth period of each quality parameter, i.e., GSC, GPC, or GOC, also increased from 0.51, 0.49, and 0.36 to 0.90, 0.87, and 0.74, respectively (Tables 4, 7). The main reasons for the improvement were that remote sensing and meteorological data were used as double-nested structure data, the reciprocal influences between crop growth and environmental information on grain quality were considered, intra-annual and inter-annual errors could be corrected (Figure 6), and the scalability of the model was enhanced in this study.
Although crop simulation models are key tools for assessing the effects of environmental factors on crop growth and development, most models still cannot simulate grain quality directly except for protein content (Nuttall et al., 2017). The HLM method used in this study provides an initial opportunity to extend the predictive power of maize grain quality characteristics.

Importance of meteorological data in predicting grain quality
Starch and protein accumulation in maize kernels have been reported to be regulated by water and temperature in previous studies (Singletary et al., 1994; Correndo et al., 2021; Guo et al., 2022). The sucrose required for starch accumulation was regulated by photosynthesis, which was affected by solar radiation, temperature, precipitation, and other meteorological factors. The status of nitrogen required for protein synthesis was thought to change dynamically in soil, affected by temperature and moisture (Archontoulis et al., 2014). Ali et al. (2010) found that grain oil content was also affected by water stress. In summary, the influence of weather data was considered when predicting and evaluating grain quality in many studies. Butts-Wilmsmeyer et al. (2019) evaluated the correlation between weather and grain quality at different stages using principal component analysis and found that temperature, precipitation, and maize grain quality were significantly correlated. Li et al. (2020) predicted GPC successfully using four weather parameters, i.e., average daily solar radiation, daily maximum and minimum temperature, and total precipitation 1 month prior to anthesis. Jahangirlou et al. (2023) also estimated grain starch, protein, and oil contents using crop models and logistic equations with detailed temperature and precipitation elements during the maize growth season.
Figure 6. Boxplots comparing prediction performance of HLM under different years, i.e., 2018, 2019, and 2020, in different phases, i.e., jointing, tasseling, grain-filling, and maturation, by using R² and RMSE. HLM, hierarchical linear modeling; RMSE, root mean square error.
In this study, three main meteorological variables, i.e., AT, Rad, and Pre, were also used to represent intra- and inter-annual variations when evaluating grain quality under different growth stages. A strong relation was shown between meteorological factors and GSC; in particular, Pre had the highest RD with GSC at each growth stage, followed by AT (Table 5). Because of this, the prediction accuracy of GSC was greatly improved by HLM compared with MLR at various growth stages, especially at the maturation stage (Table 7). The relationship between weather data and GPC was weaker than that between weather data and GSC. In addition, a good RD between weather data and GOC was found only in the sowing-jointing stage. The accuracy of the prediction models changed in a similar way. In general, weather data, especially precipitation data, should be used as the main input parameters for GSC prediction in the future, and the combined use of canopy spectral information and meteorological data can be used for GSC and GPC prediction effectively. For GOC prediction, more information should be considered to further improve the prediction accuracy.

Combination of prediction model under different growth stages
For GSC, GPC, and GOC, the highest prediction accuracy occurred at different growth stages. The best prediction result for GSC appeared at the grain-filling stage, followed by the tasseling and maturation stages. However, overfitting appeared on the validation set at the tasseling and maturation stages (R² = 0.61 and 0.69) (Table 7). The best prediction result for GPC occurred at the grain-filling and maturation stages, while that for GOC occurred at the jointing stage. Starch and protein accumulate rapidly from post-anthesis to grain-filling, but protein accumulation continues after starch deposition ceases, until near maturity (Vos, 1981). This may explain the difference in the optimal prediction period between GPC and GSC and the similar prediction accuracy for GPC at the grain-filling and maturation stages (R² = 0.84 and 0.87) (Table 7). The prediction results for GOC, however, were completely different from those for GSC and GPC, which may be related to the strong influence of meteorological conditions during the early growth stage. Unfortunately, most previous studies focused on prediction at a single growth stage, and combined prediction over multiple growth stages was scarce. We found that the accuracy of the prediction model combining multiple growth stages was much higher than that of any single-stage model on both the modeling and validation sets, indicating that the complementary information from different growth stages helps in monitoring grain quality (Tables 7, 8). In particular, the best prediction model for GSC in this study combined the predicted values from three phases, i.e., tasseling, grain-filling, and maturation (Table 8). Judging by the model coefficients, however, the predicted values at the tasseling and maturation stages carried most of the weight, which is consistent with the result of Xie et al. (2020) on the monitoring of starch content in rice; that is, heading and maturity are the most suitable periods for predicting GSC.
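The idea of combining stage-wise predictions can be sketched as follows. The sketch assumes that the combination is an ordinary least-squares regression of the observed quality value on the per-stage predicted values; the text refers only to "model coefficients", so this choice, the function names, and the synthetic data are assumptions for illustration. It also computes R², RMSE, and MAE as they are conventionally defined.

```python
import numpy as np

def combine_stage_predictions(stage_preds, observed):
    """Fit observed = c0 + sum_k c_k * prediction_k by least squares.

    stage_preds: array of shape (n_samples, n_stages), one column per stage.
    Returns the coefficient vector [c0, c_1, ..., c_K].
    """
    X = np.column_stack([np.ones(len(observed)), stage_preds])
    return np.linalg.lstsq(X, observed, rcond=None)[0]

def apply_combination(coef, stage_preds):
    """Combine per-stage predictions with previously fitted coefficients."""
    return coef[0] + stage_preds @ coef[1:]

def metrics(observed, predicted):
    """Return (R2, RMSE, MAE) for a set of predictions."""
    resid = observed - predicted
    rmse = np.sqrt(np.mean(resid ** 2))
    mae = np.mean(np.abs(resid))
    r2 = 1.0 - np.sum(resid ** 2) / np.sum((observed - observed.mean()) ** 2)
    return r2, rmse, mae

# Synthetic demonstration (made-up values, not data from this study):
# three stage-wise predictions, each equal to the truth plus independent noise.
rng = np.random.default_rng(0)
observed = rng.normal(70.0, 1.0, size=60)
stage_preds = observed[:, None] + rng.normal(0.0, 0.6, size=(60, 3))

coef = combine_stage_predictions(stage_preds, observed)
combined = apply_combination(coef, stage_preds)

r2_single = [metrics(observed, stage_preds[:, k])[0] for k in range(3)]
r2_combined = metrics(observed, combined)[0]
```

On the data used for fitting, the combined R² cannot be lower than that of any single stage, because each single-stage prediction is a special case of the linear combination. Whether the gain carries over to new data has to be checked on a separate validation set, as was done for the models in Tables 7 and 8.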
Although we achieved good results in predicting grain quality in this study, the model was established only from the phenological periods of a single hybrid at a single experimental site. When it is applied to other hybrids or sites in the future, hybrid and soil parameters that affect grain quality should be added to the model to maintain prediction accuracy. However, too many parameters may limit the large-scale application of the model, and thus, the selection of parameters is extremely important.

Conclusions
The MLR model using only remote sensing data underestimated the interference of environmental factors when assessing maize grain quality. The environmental deviation among sowing dates and years was well corrected by HLM using hyperspectral and meteorological data. Combining the predicted values of HLM at different growth stages further improved the accuracy of grain quality estimation, i.e., GSC (R² = 0.96, RMSE = 0.06, MAE = 0.05), GPC (R² = 0.90, RMSE = 0.09, MAE = 0.07), and GOC (R² = 0.85, RMSE = 0.10, MAE = 0.08). These results show that the HLM method combined with multiple growth phases has great potential for predicting grain quality in summer maize at both intra- and inter-annual scales.

Data availability statement
The original contributions presented in the study are included in the article/supplementary material. Further inquiries can be directed to the corresponding author.

Author contributions
WS and QH conducted the investigation. WS, JL, XX, YW, SZ, SM, and RW implemented all the analyses. WS and QH conceptualized the study. WS prepared the original draft. JL, XX, YW, SZ, SM, and RW prepared sections of the manuscript. QH supervised the experiment. All authors contributed to the article and approved the submitted version.