Improving the estimation of rice above-ground biomass based on spatio-temporal UAV imagery and phenological stages

Introduction Unmanned aerial vehicles (UAVs) equipped with visible and multispectral cameras provide reliable and efficient methods for remote crop monitoring and above-ground biomass (AGB) estimation in rice fields. However, existing research predominantly focuses on AGB estimation based on canopy spectral features or by incorporating plant height (PH) as a parameter. Insufficient consideration has been given to the spatial structure and the phenological stages of rice in these studies. In this study, a novel method was introduced by fully considering the three-dimensional growth dynamics of rice, integrating both horizontal (canopy cover, CC) and vertical (PH) aspects of canopy development, and accounting for the growing days of rice. Methods To investigate the synergistic effects of combining spectral, spatial and temporal parameters, both small-scale plot experiments and large-scale field testing were conducted in Jiangsu Province, China from 2021 to 2022. Twenty vegetation indices (VIs) were used as spectral features, PH and CC as spatial parameters, and days after transplanting (DAT) as a temporal parameter. AGB estimation models were built with five regression methods (MSR, ENet, PLSR, RF and SVR), using the derived data from six feature combinations (VIs, PH+CC, PH+CC+DAT, VIs+PH +CC, VIs+DAT, VIs+PH+CC+DAT). Results The results showed a strong correlation between extracted and ground-measured PH (R2 = 0.89, RMSE=5.08 cm). Furthermore, VIs, PH and CC exhibit strong correlations with AGB during the mid-tillering to flowering stages. The optimal AGB estimation results during the mid-tillering to flowering stages on plot data were from the PLSR model with VIs and DAT as inputs (R 2 = 0.88, RMSE=1111kg/ha, NRMSE=9.76%), and with VIs, PH, CC, and DAT all as inputs (R 2 = 0.88, RMSE=1131 kg/ha, NRMSE=9.94%). For the field sampling data, the ENet model combined with different feature inputs had the best estimation results (%error=0.6%–13.5%), demonstrating excellent practical applicability. Discussion Model evaluation and feature importance ranking demonstrated that augmenting VIs with temporal and spatial parameters significantly enhanced the AGB estimation accuracy. In summary, the fusion of spectral and spatio-temporal features enhanced the actual physical significance of the AGB estimation models and showed great potential for accurate rice AGB estimation during the main phenological stages.

Introduction: Unmanned aerial vehicles (UAVs) equipped with visible and multispectral cameras provide reliable and efficient methods for remote crop monitoring and above-ground biomass (AGB) estimation in rice fields.However, existing research predominantly focuses on AGB estimation based on canopy spectral features or by incorporating plant height (PH) as a parameter.Insufficient consideration has been given to the spatial structure and the phenological stages of rice in these studies.In this study, a novel method was introduced by fully considering the three-dimensional growth dynamics of rice, integrating both horizontal (canopy cover, CC) and vertical (PH) aspects of canopy development, and accounting for the growing days of rice.
Methods: To investigate the synergistic effects of combining spectral, spatial and temporal parameters, both small-scale plot experiments and large-scale field testing were conducted in Jiangsu Province, China from 2021 to 2022.Twenty vegetation indices (VIs) were used as spectral features, PH and CC as spatial parameters, and days after transplanting (DAT) as a temporal parameter.AGB estimation models were built with five regression methods (MSR, ENet, PLSR, RF and SVR), using the derived data from six feature combinations (VIs, PH+CC, PH +CC+DAT, VIs+PH +CC, VIs+DAT, VIs+PH+CC+DAT).

Results:
The results showed a strong correlation between extracted and groundmeasured PH (R2 = 0.89, RMSE=5.08 cm).Furthermore, VIs, PH and CC exhibit strong correlations with AGB during the mid-tillering to flowering stages.The optimal AGB estimation results during the mid-tillering to flowering stages on plot data were from the PLSR model with VIs and DAT as inputs (R 2 = 0.88, RMSE=1111kg/ha, NRMSE=9.76%), and with VIs, PH, CC, and DAT all as inputs (R 2 = 0.88, RMSE=1131 kg/ha, NRMSE=9.94%).For the field sampling data, the ENet model combined with different feature inputs had the best estimation results (% error=0.6%-13.5%),demonstrating excellent practical applicability.
Discussion: Model evaluation and feature importance ranking demonstrated that augmenting VIs with temporal and spatial parameters significantly enhanced the AGB estimation accuracy.In summary, the fusion of spectral and spatio-temporal

Introduction
Rice, as one of the most important staple crops in China, plays a pivotal role in ensuring food security and promoting agricultural sustainability (Huang et al., 2011;Wang et al., 2014).The aboveground biomass (AGB) serves as a key indicator for reflecting rice growth and predicting the yield (Confalonieri et al., 2009b;Choi et al., 2013;Adeluyi et al., 2022).Therefore, obtaining precise AGB estimates during the primary phenological stages of rice is essential for assessing rice growth status, targeting rice management, and optimizing agricultural practices (Gnyp et al., 2013;Zhang et al., 2017).
Conventional methods for obtaining rice growth parameters involve labor-intensive field measurements using various tools (Yao et al., 2013).Although these methods provide relatively accurate data, they are time-consuming and impractical for large-scale measurements (Confalonieri et al., 2009a).Notably, destructive sampling is required for direct AGB measurement, making it unsuitable for long-term monitoring and affecting yield assessment (Borra-Serrano et al., 2019;Nakajima et al., 2023).Also, it is difficult to accurately assess the overall condition of large rice fields due to the limitation of sampling points and spatial heterogeneity (Zhou et al., 2023).In recent years, remote sensing monitoring technologies, based on platforms such as satellites, airborne, unmanned aerial vehicles (UAVs) and ground, have emerged as innovative alternatives for AGB estimation (Duan et al., 2019;Mostofialmamaleki et al., 2019).These platforms are equipped with various sensors, including digital, multispectral, hyperspectral, thermal infrared and synthetic aperture radar (SAR), enabling timely and non-destructive acquisition of canopy information (Chakraborty et al., 1997;Mashimbye et al., 2012;Han et al., 2021;Qiu et al., 2021).
Satellite remote sensing possesses the capability for all-weather, multisource, and multiscale monitoring, allowing synchronous observations over large rice areas (Inoue et al., 2014;Wang et al., 2016;Jiang et al., 2021).However, it is accompanied by drawbacks such as long revisit periods, high satellite costs, and significant interference from cloud cover, posing challenges in ensuring monitoring precision (Feng et al., 2021).The combination of airborne hyperspectral and LiDAR data enables effective monitoring of biomass, but the cost of operating and maintaining airborne platforms is also high, making them more commonly used for large-scale forest monitoring than for crops such as rice (Brovkina et al., 2017;Gao et al., 2022).Ground remote sensing systems are close to the surface, facilitating more detailed information on rice, but are limited in their coverage due to fixed station locations (Gnyp et al., 2012).In comparison, the application of near-ground UAVs provides significant assistance in improving the temporal and spatial resolution of rice monitoring, with the advantages of cost-effectiveness, operational simplicity, and high efficiency in large-scale monitoring (Wan et al., 2020;Bascon et al., 2022;Luo et al., 2022;Xu et al., 2022a).In particular, consumergrade UAVs equipped with visible and multispectral cameras can conveniently obtain canopy structural parameters and key spectral, color and texture features, which have been widely used to estimate AGB (Xu et al., 2022b;Zheng et al., 2023).
Traditional methods for estimating AGB from UAV imagery have mainly relied on vegetation indices (VIs), which have achieved acceptable estimation accuracy (Gnyp et al., 2014;Cheng et al., 2017).Researchers have developed dozens of VIs, such as Normalized difference vegetation index (NDVI), Leaf Chlorophyll Index (LCI) and Soil Adjusted Vegetation Index (SAVI) (Huete, 1988;Peñuelas et al., 1997;Xue and Su, 2017).These dimensionless indices integrate spectral data from different narrow-band wavelengths and can reflect characteristics including coverage, chlorophyll content, moisture, and health status (Bannari et al., 1995).Along with VIs that make use of spectral features, texture indices that reflect subtle variations in canopy structure are also commonly used to enhance the estimation of AGB (Wang et al., 2022b;Liu et al., 2023a).Many studies predominantly focus on establishing empirical relationships directly between these features and ground-measured AGB values using statistical analysis and machine learning algorithms.Typical methods involve Linear Regression, Partial Least Squares Regression, Support Vector Machine, Random Forest, and Artificial Neural Network (Yang et al., 2018;Zhang et al., 2020;Wang et al., 2023a, Wang et al., 2023b).The lack of a solid foundation in physics and physiology will limit the improvement of AGB estimation accuracy.Moreover, crop spectral response can be influenced by various factors, including crop type, moisture content, nutrient status, soil, and atmospheric spatio-temporal variations (Hoffer, 1978;Bannari et al., 1995).Therefore, VIs tend to be unstable during the long growth process of rice and have the potential for saturation under high biomass conditions, posing difficulties for AGB estimation over multiple phenological periods (Yue et al., 2019;Liu et al., 2023c).
To address the limitations of VIs, researchers have integrated plant height (PH) to enhance AGB estimation (Bendig et al., 2015;Panday et al., 2020;Wang et al., 2022a;Yang et al., 2023).Threedimensional point cloud data of the crop canopy can be obtained using photogrammetry or LiDAR technology (Sun et al., 2022).After interpolation, the 3D point cloud data can be generated into a digital surface model (DSM) for extracting plant height.This method has been widely applied to UAV remote sensing monitoring of crop PH with strong physical significance and high accuracy (Kawamura et al., 2020;Ji et al., 2022;Liu et al., 2022c).However, relying solely on PH cannot comprehensively reflect rice growth as it involves only one dimension.To gain a more comprehensive understanding of changes during crop growth, Canopy Coverage (CC) has received attention from researchers (Walter et al., 2015;Shu et al., 2023).Canopy cover refers to the proportion of land covered by the vertical projection of the vegetation canopy (Guevara-Escobar et al., 2005;Lee and Lee, 2011), which can be used to quantify the expansion of the rice canopy in the horizontal dimension as AGB increases.Studies have shown that CC is a reliable parameter for reflecting plant canopy growth and estimating AGB, Leaf Area Index (LAI), and yield (Nielsen et al., 2012;Goodwin et al., 2018;Garcıá-Martıńez et al., 2020).Currently, there are relatively few studies combining PH and CC for AGB estimation, emphasizing the need to strengthen the role of CC.The physical significance of AGB estimation can be further improved from both vertical and horizontal perspectives.
The stages of crop growth can be quantified through various metrics, including time-based measures like Days After Sowing (DAS) and thermal-based measures like Growing Degree Day (GDD) (Li et al., 2022).Additionally, widely adopted classification systems like the Feekes code and Zadoks code delineate distinct phenological stages of crops (Zadoks et al., 1974;Simmons et al., 1985;Chin et al., 1991).Connections can be established between these orderly digitized metrics and the accumulation of crop biomass.Typically, rice biomass can be effectively expressed as a function of time (t) using logistic or Gompertz models (Yu et al., 2002;Varela et al., 2021).Currently, studies have mainly focused on establishing AGB estimation models in specific or overall phenological stages of rice.Among them, the quantitative value for rice growth has received limited attention.Considering the effect of temporal parameters would compensate for the possible saturation and instability of spectral features as the crop grows.Therefore, apart from spectral and spatial parameters, the days after transplanting (DAT) of rice was also introduced as a parameter in this study to investigate its impact on AGB estimation across the main phenological stages.This integrated approach aims to provide more accurate and detailed information for monitoring AGB in rice growth, offering robust support for agricultural management and decision-making.
Considering the limitations of the commonly used methods for estimating rice AGB using VIs and PH, a novel multidimensional approach was introduced in this study that considers both spatial structure and phenological stages.In addition, to ensure the robustness and practical applicability of AGB estimation model, it was imperative to transition from the controlled small-scale experimental plots to the complex reality of large-scale rice fields.The combination of experimental and sampling data could reinforce the reliability of the models, making it a valuable tool for rice monitoring.The main objectives of this study were to (1) obtain 3D point cloud data and multispectral features of rice canopies by UAV-borne digital and multispectral cameras, (2) precisely extract 20 types of VIs, PH and CC based on UAV images from rice fields, (3) estimate rice AGB across main phenological stages with five regression algorithms (MSR, ENet, PLSR, RF and SVR) and six feature combinations (VIs, PH+CC, PH +CC+DAT, VIs+PH+CC, VIs+DAT, VIs+PH+CC+DAT) and evaluate the simulation and accuracy, (4) construct a multidimensional AGB estimation model that integrates spectral, temporal and spatial features of rice.

Study area and experimental design
In this study, small-scale plot experiments and large-scale field sampling were both carried out to validate and assess the applicability of the AGB estimation models developed.The specific scheme is illustrated in Figure 1.The plot experiments were conducted at the experimental site of Hohai University in Nanjing, Jiangsu Province, China (31°55′N, 118°47′E) during 2021 and 2022.The study area was under a humid subtropical climate with an average annual precipitation of 1090.4mm and an average annual temperature of 15.4°C.Japonica rice variety Nanjing 9108 was transplanted into drainage lysimeters at a spacing of 20 cm × 15 cm.All plots were 2.5 meters long, 2 meters wide and 2 meters deep.Five nitrogen fertilizer treatments were used: 0 kg/ha (N0), 150 kg/ ha (N1), 225 kg/ha (N2), 300 kg/ha (N3), and 375 kg/ha (N4), each with four replications.In addition, all treatments were applied with 75 kg/ha of phosphate fertilizer and 120 kg/ha of potash fertilizer.Nitrogen fertilizer was applied in three stages (base, tiller, and spike) at a 4:3:3 ratio, while phosphate fertilizer was applied as a base, and potash fertilizer was split into a base and spike application at a 1:1 ratio.Weeds in the plots were manually removed as appropriate.
To evaluate the AGB estimation models, extensive field sampling occurred in the northern region of Jiangsu Province in October 2021.This field sampling involved multiple areas, including three fields in Xuzhou (34°2′N, 117°52′E) and two fields in Suqian (33°59′N, 118°28′E), as depicted in Figure 1C.Both Xuzhou and Suqian experience a semi-humid monsoon climate, with an annual precipitation of about 900 mm and an average annual temperature of about 15°C.The same variety of japonica rice was also transplanted locally at a density of 300,000 plants per hectare.

Ground data acquisition
Rice plant height (PH) was measured as the vertical height from the soil surface to the top of the canopy using a steel ruler.In plot experiments, six rice plants were selected in each plot for periodic observation to assess the extracted plant height values from the UAV images.The acquisition time of ground data in plot experiments is shown in Table 1.On the day of each destructive sampling, we collected above-ground portions from three representative rice plants in the lysimeter plots and five representative rice plants from the fields.Subsequently, the PH of these plants was measured using the same steel ruler.The aboveground biomass (AGB) of these plants was then determined using the oven-drying method.These plant samples were initially dried at a constant temperature of 105°C for 0.5 hours, followed by a constant temperature of 75°C for 48 hours until reaching a constant temperature.The dry weight of each sample was measured using an electronic balance.For plot experiments, the average AGB of three plant samples per plot, after multiplied by the rice planting density, was converted into the average AGB per hectare (kg/ha).From a total of 10 samplings over two years, 200 groups of data were obtained in plot experiments.For field sampling, the average AGB of five plant samples from each of the five fields was calculated as individual plants (g/plant).

Image acquisition and compositing
The DJI Phantom 4-Multispectral (P4M) (DJ-Innovations, Shenzhen, China) was employed for this experiment, featuring an integrated multispectral imaging system comprising one visible light camera and five multispectral cameras (Blue, Green, Red, Red Edge, and NIR).These cameras were responsible for capturing visible light and multispectral images, respectively, with the single camera boasting an effective pixel count of 2.08 million.Additionally, the DJI P4M was equipped with a Real-Time Kinematic (RTK) system, ensuring centimeter-level positioning accuracy.In positioning mode, the DJI P4M could fly at a maximum horizontal speed of 50 km/h.The vertical hovering accuracy was ±0.1 m and the horizontal hovering accuracy was ±0.1 m.
All image acquisitions were conducted during midday hours (10:00 to 14:00) on clear, cloudless or less cloudy days.Flight missions were planned using the software DJI GS Pro (DJ-Innovations, Layout of the small-scale experimental plots and large-scale sampling fields in this study.Shenzhen, China), with a forward overlap rate of 80% and a side overlap of 70%.The DJI P4M captured images in "waypoint hovering" mode and flew at 18 km/h.The altitude for each mission of the plot experiments and field sampling was set at 10m and 35m, respectively.The corresponding image accuracies were 0.5 cm/px and 1.9 cm/px, respectively.Two calibration boards of known reflectance were also photographed each time the UAV mission was conducted for later radiometric correction of the images.The reflectance of calibration boards in different bands is shown in Table 2.
Before transplanting the rice, and on the day of PH measurement and destructive sampling, UAV missions were conducted to obtain images.All UAV images were reconstructed in 2D multispectral by the software DJI Terra (DJ-Innovations, Shenzhen, China).The 3D point cloud data was rasterized to obtain the DSM.The calibration data was imported for reflectance radiometric correction to generate ortho-mosaic results for each band (Figure 2).

Calculation of vegetation indices, plant height and canopy cover
The elevation data of the images at different growth stages were uniformly corrected by setting fixed control points.Rice PH at different stages was extracted by subtracting the DSM of the rice canopy from the bare soil DSM using the Raster Calculator function of ArcMap 10.8.At the corresponding position of each designated observation plant, a 10 cm × 10 cm sample square was drawn as a polygon element.Then the mean value of all raster data within the frame was calculated as the PH of that rice plant using the Zonal Statistics function of ArcMap 10.8.The PH inputs corresponding to the AGB samples in the regression model were the mean values of PH extracted within the plot or field on the day of destructive sampling.In addition, 20 VI maps were also calculated from five single-band reflectance maps using the Raster Calculator function.The calculation formulas are shown in Table 3.
To facilitate the accurate extraction of rice canopy features from the images, it was essential to exclude the interstitial soil areas between adjacent rice plants.The maximum inter-class variance method (OTSU algorithm) (Otsu, 1979) was applied in MATLAB R2021b to extract the rice canopy.This algorithm returned a threshold for effectively distinguishing between foreground (rice plants) and background (soil).The segmentation threshold for plants and soil was determined using NIR band images in this study.The rice canopy extracted according to this threshold then served as a mask for extracting VI maps.In plot experiments, the outermost two rows of plants on each side of the plots were excluded and the remaining rice canopy pixel points were included in the VI calculation.Additionally, the extracted rice canopy was also used to calculate canopy cover (CC).The CC of each plot or field was the ratio of the rice canopy-covered area to the total area.

Correlation analysis
In this study, correlation analysis was carried out between features (VIs, PH and CC) obtained from UAV images and the   corresponding AGB measured on each sampling day.These comparisons were based on the Pearson correlation coefficient (r) (Lee Rodgers and Nicewander, 1988), a dimensionless metric that assesses the degree of correlation between variables.The value of r ranges from −1 to 1 and an absolute value closer to 1 indicates a stronger correlation between the two variables.The formula for calculating r is as in Equation ( 1): where x and y are the mean values of variables x and y, respectively.The main methods of this study.

Modelling methods
Highly correlated data points from multiple samplings were chosen to create the model dataset.This dataset was divided randomly into two subsets: a training set, including 70% of the dataset, and a separate test set, comprising the remaining 30%.To avoid the performance of the model being affected by the order of magnitude of different features, it was essential to pre-process the data.Standardization was employed, using the mean and standard deviation, to scale and normalize the data consistently (Milligan and Cooper, 1988).The formula for standardization is as in Equation (2): where x and s are the mean and the standard deviation of the variable x, respectively.
Five regression methods were employed in this study, including Multiple Stepwise Regression (MSR), Elastic Net Regression (ENet), Partial Least Squares Regression (PLSR), Random Forest Regression (RF) and Support Vector Regression (SVR).All model construction was completed in Python.
Multiple Stepwise Regression is a method of iteratively checking the significance of each independent variable to finally obtain the independent variable that has a significant effect on the dependent variable (Ku and Popescu, 2019).It includes three methods: forward selection, backward elimination and bidirectional elimination.In this study, forward stepwise regression was used and the adjusted R² value was selected to compare the performance of the model.The independent variables that contributed the most to the model were selected to be added.The final multivariate linear model (Altman and Krzywinski, 2015) was built using those selected independent variables.
Elastic Net Regression is a regularized regression method based on a linear model that combines the L1 and L2 penalties of the Lasso and Ridge methods (Zou and Hastie, 2005).Ridge regression retains all variables in the model and is not applicable to feature selection.Lasso regression does not solve the problem of rational selection among highly correlated features.ENet as a combination of both has advantages in dealing with multicollinearity as well as reducing overfitting.In this study, the strength and scale of regularization were determined based on the parametric grid search.
Partial Least Squares Regression, as a widely used algorithm, can address the problem of covariance between independent variables and enables dimensionality reduction in the latent variable space (Mashimbye et al., 2012).The number of features to be retained after dimensionality reduction can be specified.It can explain as much as possible about the relationship between the original independent variables and the dependent variable providing strong model interpretability.In this study, the optimal number of features after dimensionality reduction was found by loop traversal for different combinations of feature inputs.
Random Forest Regression is an ensemble learning method that operates by constructing a large number of decision trees while training (Izquierdo-Verdiguier and Zurita-Milla, 2020).RF avoids the problem of overfitting in the Decision tree model and it returns the average prediction of individual trees in a regression task.In this study, the number of decision trees, maximum depth, and maximum number of randomly selected features for each decision tree were determined by parametric grid search.Importance scores of features in the RF algorithm were used to support feature selection.
Support vector regression constructs a regression model by finding a set of data points that are closest to the target value in the feature space, which is called a support vector, to fit the data (Smola and Schölkopf, 2004).The goal of SVR is to find the best hyperplane at the maximum interval to minimize the total prediction error.In this study, the type of kernel, regularization parameter, and kernel coefficients used in support vector machines were determined by parametric grid search.
To explore what combination of spectral features, spatial parameters and the temporal parameter can achieve the optimal AGB estimation results, six feature combination scenarios were considered for each method: (1) VIs; (2) PH and CC; (3) VIs, PH, and CC; (4) VIs and DAT; (5) PH, CC, and DAT; (6) VIs, PH, CC, and DAT.Grid search was employed to find a more optimized group of parameters.

Model accuracy assessment
The models built by each method on the training set were crossvalidated with four folds.The K-fold cross-validation is a common method of assessing model performance (Zhang and Liu, 2023) To evaluate the accuracy of the extracted Plant Height (PH) and the performance of the regression models, two metrics were employed: the coefficient of determination (R 2 ) (Karch, 2020), the root mean square error (RMSE) (Hyndman and Koehler, 2006) and the normalized root mean square error (NRMSE) (Khan and Osińska, 2023).R 2 serves as an indicator of the proportion of variance explained by the model relative to the total variance and typically falls between 0 and 1.Higher R 2 values signify better model explanatory power, with values closer to 1 indicating a stronger fit.
RMSE is a widely used statistic for quantifying the discrepancy between estimated and actual values.It's particularly suitable for comparing different methods on the same dataset since it usually has the same units as the estimated and true values.NRMSE is the normalized value of RMSE, usually expressed as a percentage.Smaller values of RMSE and NRMSE indicate a closer correspondence between estimated and actual values, reflecting a superior fitting performance of the model.Conversely, larger RMSE and NRMSE values suggest diminished model accuracy.
The testing results of the AGB estimation model on field samples were assessed using percent error (%error).The formulas for calculating R 2 , RMSE, NRMSE and %error are as in Equations (3-6): where n is the number of data points, x i is the true value of the data point, y i is the estimated value of the data point, x, x max and x min are the mean, maximum and minimum values of true values, respectively.

Model interpretation
In addition to focusing on model accuracy, the correct interpretation of model outputs is also significant for model improvement.Simple models (e.g., linear models) are easy to interpret but may be less accurate.Complex models are usually more accurate but, as a 'black box', often have complicated internal mechanisms that are difficult to explain in concrete ways.For this reason, Shapley additive explanations (SHAP) (Lundberg and Lee, 2017) was used in this study to interpret the predictions of each regression model as a unified measure of the feature importance.The concept of Shapley values was derived from the theory of cooperative games and can be used to measure the marginal contribution of each feature to the final output.SHAP analysis does not depend on the internal mechanism of the model, so it is suitable for various types of models with inputs and outputs, such as linear regression models, tree-based models and neural networks.
In this study, the Shapley value calculation was implemented with the help of a module package in Python.The results were visualized through bee swarm plots, with features ordered by their importance level from top to bottom.Positive Shapley values for each feature signified a positive impact on increasing the predicted value, while negative values indicated a negative impact.Furthermore, the color of the dots represented the specific feature value.

Comparison between measured and extracted PH
The comparison between ground-measured PH and PH extracted from the DSM was conducted from tillering to milk ripening stages (DAT30, DAT40, DAT50, DAT60 and DAT70) in the year 2022, as illustrated in Figure 3.Each measurement involved a total of 120 observation plants distributed across 20 plots.Figure 3 demonstrates a strong correlation between ground-measured PH and extracted PH.Notably, as the rice plants matured, this correlation exhibited a manifest strengthening trend.
In the first three measurements, the R 2 values were 0.72, 0.75 and 0.75, and the RMSE values were 3.89, 4.34 and 5.2 cm, respectively.Remarkably, for the latter two measurements, the R 2 values were 0.84 and 0.88, and the RMSE values were 3.96 and 3.42 Frontiers in Plant Science frontiersin.orgcm, respectively, demonstrating higher correlations.When considering all five measurements collectively, the overall R 2 value reached 0.89 and the RMSE was 5.08 cm, signifying that the utilization of 3D point cloud data obtained from UAV facilitated a convenient and relatively accurate measurement of rice PH.

Correlation analysis between features and AGB at different growth stages of rice
The study examined the correlation between the 20 VIs, PH, CC extracted from UAV images and AGB of rice plants in 2021 and 2022, as illustrated in Figure 4. Overall, correlations between these variables followed a pattern of increasing and then decreasing as the rice grew.Notably, during the mid-tillering to flowering stages (DAT48 and DAT69 in 2021, DAT27, DAT42, DAT52 and DAT68 in 2022), most VIs showed strong correlations with AGB (p< 0.01).In 2021, VIs like GNDVI, LCI, MSR, NDRE and WDRVI reached correlations as high as 0.87 at DAT48.In 2022, VARI had the highest correlation with AGB at DAT68, reaching 0.92.However, during early and late growth stages (DAT114 in 2021, DAT15, DAT82 and DAT109 in 2022), the correlations were weaker or nonexistent.PH and CC also showed significant correlations with AGB, suggesting their potential for improving AGB estimation accuracy based on VIs.

Regression modelling for AGB
Based on the results in section 3.2, a total of 120 groups of data from the six samplings (DAT48 and DAT69 in 2021, DAT27, DAT42, DAT52 and DAT68 in 2022) were collected as a plot dataset to estimate the rice AGB during the major growth stages (from mid-tillering to flowering stages).The training set consisted of 84 groups randomly selected from the dataset.The test set consisted of the remaining 36 groups.The mean results of the four-fold cross-validation on the training set of five regression models are shown in Figure 5, each employing six different combinations of features.When only VIs were used as the independent variable, the MSR and PLSR models had similar accuracy results with R 2 values of 0.73 and RMSE values of 1758 and 1760 kg/ha, respectively.Models constructed from spatial parameters (PH and CC) were all less accurate.Among them, the RF model performed the best (R 2 = 0.69, RMSE=1873 kg/ha).With either VIs as features or PH and CC as features, the mean NRMSE of the four-fold cross-validation of the SVR model was the lowest compared to the other models, namely 13.42% and 16.38%, respectively.With the addition of the temporal parameter (DAT) to the spatial parameters, the accuracy of the other four models, except for the MSR model, improved.In particular, the R 2 value of the RF model increased to 0.75 and the RMSE value decreased to 1699 kg/ha.
As shown in Figure 5, the addition of spatial parameters to the VIs together as input features can significantly improve the estimation accuracy of rice AGB.When VIs, PH and CC were used as model inputs, the MSR and PLSR models had R 2 values that exceeded 0.80, RMSE values of 1501 and 1519 kg/ha, and NRMSE values of 12.09% and 12.36%, followed closely by the ENet, SVR and RF models.Similarly, the combination of VIs and DAT provides better estimation accuracy for AGB than the models constructed from VIs only, but slightly lower than those constructed from VIs, PH and CC.When VIs, PH, CC, and DAT were all taken as the features, the estimation accuracy of all five regression models was also good.The MSR, ENet, PLSR and SVR models obtained quite similar results in terms of R 2 values, with the MSR model having the lowest RMSE value of 1517 kg/ha and NRMSE value of 12.16%.Overall, it was shown that the feature combination of VIs with spatial parameters was superior to that of VIs with spatio-temporal parameters, followed by that of VIs with temporal parameters.

Importance analysis of model features
Figure 6 illustrates the feature importance of the AGB estimation models constructed from VIs and combinations of VIs, PH, CC and DAT.In the models built with VIs only, the index CVI made the highest contribution to the MSR, ENet, PLSR and RF model outputs, and the second highest contribution to the SVR model output.Moreover, other indices that contributed more to the output in several models included RVI, GLI, CIg, EVI and MSR.
When VIs, PH, CC and DAT were all applied as model input, the PH made a significant contribution to the results of the models.In the ENet model, it was the feature PH that contributed the most to the output, and PH ranked second in the PLSR and RF model.The CVI index also contributed greatly to the ENet, RF and SVR model outputs.In addition, the VIs that contributed significantly to the output of each model included RDVI, EVI2 and NDVI in the MSR model, GLI in the ENet, PLSR, RF and SVR model, LCI in the MSR and PLSR model, GNDVI in the PLSR, RF and SVR model.
Furthermore, as can be seen from the distribution of feature value in Figure 6, the sample points with high feature values of VIs such as RDVI, CVI, CIg and LCI have mostly positive SHAP values, while the sample points with low feature values of these VIs have mostly negative SHAP values.This indicated that when the value of these VIs was higher, the estimated AGB value would be larger.On the contrary, the lower their value, the smaller the estimated AGB value would be.Thus, the DVI and EVI2 in the MSR model and GNDVI in the PLSR and SVR model showed the opposite property.Their larger feature values decreased the estimated AGB while smaller feature values increased the estimated AGB.

Testing results of AGB estimation models on experimental plot data
The testing results of the established regression models on the independent test set of the plot dataset are shown in Figure 7.Among the models using VIs as the independent variables, the SVR model had the highest accuracy with an R 2 value of 0.8, an RMSE value of 1432 kg/ha, and an NRMSE value of 12.59%.In addition, the R 2 value of the other four models all exceeded 0.78.AGB models constructed from the spatial parameters PH and CC all had relatively lower testing accuracy with their R 2 values ranging from 0.47 to 0.67 and RMSE values ranging from 1864 to 2340 kg/ha.Also, the addition of the temporal parameter DAT based on PH and CC could improve the testing accuracy of AGB to a certain extent.
Models constructed from combinations of VIs along with PH and CC all achieved relatively high testing results.Among them, the Average accuracy of four-fold cross-validation of the AGB estimation models built from five regression algorithms (MSR, ENet, PLSR, RF and SVR), each using six feature combinations (VIs, PH+CC, VIs+PH+CC, VIs+DAT, PH+CC+DAT, VIs+PH+CC+DAT): MSR model showed the best performance on the test set with an R 2 value of 0.87, an RMSE value of 1146 kg/ha, and an NRMSE value of 10.07%, followed by the SVR, ENet and PLSR models.In the case of VIs and DAT as features, the PLSR model showed the most accurate result with an R 2 value as high as 0.88, an RMSE value of 1111 kg/ha, and an NRMSE value of 9.76%.Furthermore, with all parameters included as the independent variable, again the PLSR model showed the best accuracy (R 2 = 0.88, RMSE=1131 kg/ha, NRMSE=9.94%).

Testing results of AGB estimation models on sampling field data
To explore the applicability of the established AGB estimation models on large-scale fields, the models were tested on five extra fields and the percent error between their estimated and measured values was calculated.The input features considered included VIs and their combinations with spatial and temporal features.The estimation results of different models and their errors are shown in Table 4.
Considering four different feature input situations, the estimation results of the ENet model were optimal on all five fields.When using VIs and DAT as inputs, the ENet model had an average %error of only 5.7% in estimating AGB for the five fields, followed by VIs as inputs (5.8%), VIs, PH and CC as inputs (6.4%), VIs, PH, CC and DAT as inputs (6.4%).For the PLSR method, with all of VIs, PH, CC and DAT as inputs, the best AGB estimation results could be obtained (6.9%).Compared to the other three feature combinations, the MSR and SVR models showed relatively better results when VIs and DAT were used as inputs, with average %error of 17.0% and 20%, respectively.Notably, when the stepwise regression method selected a large number of parameters, it may lead to the constructed linear model having a relatively large error on the new data.For all five fields, the RF model combined with different feature combinations produced results with large errors, with %error ranging from 63% to 75.4%.Overall, the ENet and PLSR models performed better for AGB estimation on large fields compared to the other models.Moreover, the introduction of temporal or spatial parameters can improve the accuracy of these models in estimating AGB in large fields to some extent.Importance ranking of features in different AGB estimation models constructed from two types of feature inputs: (A-E) VIs; (F-J) VIs, PH, CC and DAT.

Performance evaluation of different regression models for estimating AGB
Five regression methods, MSR, ENet, PLSR, RF and SVR, were used in this study to construct the AGB estimation model.In terms of the results of the models on the test set of the plot dataset, the performance of the five models varied when facing high-dimensional data (Figure 7).For the MSR method, the introduction of the DAT feature into the model characterized by VIs, PH and CC did not result in any further improvement in model testing accuracy.Similarly, for the SVR method, the introduction of extra parameters into the model characterized by VIs and DAT or VIs, PH and CC did not result in any improvement in the testing accuracy.This may be attributed to the overfitting caused by a larger number of features (Sim et al., 2018).In comparison, the RF model performed better (Zhang et al., 2022).In this study, feature selection was carried out based on the importance scores of features in the RF algorithm (Li et al., 2017).Whether the DAT feature was introduced over VIs, PH, and CC, or PH and CC over VIs and DAT, the RF model demonstrated a noticeable improvement in the testing accuracy.
In addition, we found that the RF model performs better with high-dimensional data than with low-dimensional data.When two Testing results of the AGB estimation models on the test set of plot dataset:  The MSR, ENet, and PLSR methods are all linear regression methods.When VIs, PH, CC and DAT were all included in the model features, the ENet model had the lowest accuracy.This may be attributed to the possibility that although the ENet model combines the characteristics of Lasso and Ridge regression, it does not eliminate the shortcomings in handling multicollinearity and feature selection.Therefore, some bias and inconsistency in parameter estimation and the problem of high variance in Lasso estimates remain (Kayanan andWijekoon, 2019, 2020).In dealing with multicollinearity and feature selection, the MSR model can obtain a reliable and effective set of feature combinations through stepwise selection (Kolasa-Wiecek, 2015), while the PLSR model can control the complexity of the model by setting the number of Latent Variables to achieve dimensionality reduction (Shen et al., 2020).In this study, the more appropriate process of dimensionality reduction allowed the MSR and PLSR models to outperform the ENet model on the test set.

Combining spatial and temporal parameters to improve the AGB estimation accuracy
In this study, we investigated the accuracy differences of AGB regression models for various combinations of spectral features (VIs), spatial parameters (PH and CC), and the temporal parameter (DAT).As can be seen from Figures 5 and 7, compared with models constructed from spectral features or spatial parameters alone, the combination of these features showed a significant improvement in estimation accuracy.The increase in rice PH in the vertical direction and the expansion of CC in the horizontal direction were generally consistent with the continuous increase of rice AGB.Therefore, the combination of PH and CC can visualize the changes in rice AGB in a 3D space (Maimaitijiang et al., 2019).Similar studies (Adeluyi et al., 2022;Shu et al., 2023) have also suggested the positive significance of PH and CC for AGB estimation, which is consistent with the findings of this study.
Image resolution affects the accuracy of AGB estimation using spectral information and crop surface models (Modica et al., 2020;Liu et al., 2021).Low spatial resolution reduces image quality.High spatial resolution brings more details of the crop canopy but captures environmental noise from soil, weeds, and shadows at the same time (Liu et al., 2022b;Zhu et al., 2023).Considering the large areas of the sampling fields, we did not employ the same flight height of image acquisition as the experimental plots in terms of practicality.The five groups of field samples were mainly used to test the applicability of the models.Although the spatial resolution of the large sampling fields (1.9 cm/px) was slightly lower than that of the experimental plots (0.5 cm/px), the ENet and PLSR models still achieved good estimation results in this study.We will consider constructing estimation models with greater applicability across field data in the future.In addition, the impact of extracted PH tending to be lower than the actual PH values needs to be further investigated (Willkomm et al., 2016;Ji et al., 2022).Here, the underestimation of PH may be attributed to the sparse canopy point cloud, which could be occasionally mistaken for the soil point cloud when the rice canopy cover was minor.We will further calibrate the extracted PH values in subsequent studies to match the actual conditions more closely and thus improve the accuracy.In addition, the CC in this study was calculated by extracting the rice canopy using only one algorithm, OTSU.Combining methods such as machine learning may improve the accuracy of CC and further enhance the accuracy of AGB estimation.
From a temporal perspective, the growth of rice biomass conforms to the rule of the Logistic curve, which can be characterized as slow growth at the initial stage, rapid growth in the middle stage, slow growth in the later stage, and eventually converging to the maximum value infinitely (Yu et al., 2002).Therefore, the growing days of rice can also function as a parameter for estimating AGB and can compensate for the possible saturation of VIs, PH, and CC in the later stages of rice growth.In this study, the results show that introducing the temporal parameter DAT based on VIs can indeed improve the estimation accuracy of AGB.In particular, when the spatial parameters of rice cannot be obtained due to objective conditions, adding DAT to spectral features to estimate AGB is also an alternative.On the whole, the feature combinations of VIs, PH, CC, and DAT showed superior evaluation performance for all five modelling approaches (Figure 7).For the MSR and ENet models, the combination of VIs, PH, CC, and DAT demonstrated similar test accuracy to the combination of VIs, PH, and CC; for the PLSR model, the combination of VIs, PH, CC, and DAT demonstrated similar test accuracy to the combination of VIs and DAT.
Overall, we took advantage of the convenience of obtaining, processing and analyzing point cloud data and multispectral remote sensing images.Considering that the information on multispectral bands is limited and sensitive to the atmosphere and clouds, integrating multi-source remote sensing data is one future study target (Yu et al., 2023;Zhai et al., 2023;Liu et al., 2023b).For instance, hyperspectral remote sensing, synthetic aperture radar (SAR), and thermal infrared remote sensing (TIR) could provide more abundant spectral and temperature information, and help to minimize the influence of weather and environmental factors (Alebele et al., 2020;Liu et al., 2022a;Xu et al., 2023;Zhang et al., 2024).Spectral analysis methods such as wavelet analysis could also be employed to provide more integrated and in-depth data (Wang et al., 2022b).

Generalization ability of regression algorithms on different nitrogen samples
To explore the generalization ability of each regression algorithm, samples with low-nitrogen (N1) and high-nitrogen (N4) treatments were used for testing, respectively and the remaining treatments were used as the training set.Thus, two scenarios were included, one using the N1, N2, and N3 sample set to train the model and test it on the N4 sample, and the other using the N2, N3, and N4 sample set to train the model and test it on the N1 sample.VIs, PH, CC and DAT were all taken as model features.The constructed models were also crossvalidated with four folds and the average accuracy results are shown in Figure 8.Among the models trained from the N1, N2 and N3 sample set, the MSR model had the highest R 2 value of 0.76 and the lowest RMSE value of 1461 kg/ha, followed by the SVR, ENet and PLSR models.As for the models trained from the N2, N3 and N4 sample set, the RF, SVR and MSR models had similar accuracy, with their R 2 values exceeding 0.69.Overall, the accuracy of the AGB estimation models constructed from the N1, N2, and N3 sample set was higher than that of models constructed from the N2, N3, and N4 sample set.
The testing results of the models for the two scenarios above are shown in Figure 9.The estimated results for the majority of the N4 sample points were smaller than the measured values, while those for the majority of the N1 sample points were larger than the measured values.The better models for estimating the N4 treatment were the ENet and SVR models with their R 2 values up to 0.81.The PLSR and RF models had similar estimation accuracy, with R 2 values of 0.80.In addition, the best model for estimating N1 treatment was the PLSR model with an R 2 value of 0.80, an RMSE value of 1362kg/ha and an NRMSE value of 13.77%.The RF model had the lowest estimation accuracy for the N1 sample, with an R 2 value of only 0.52.Although the accuracy of the modelling set performed well, the MSR model obtained the lowest accuracy in predicting the N4 samples.This may be attributed to overfitting.Studies have shown that small-scale training may cause the problem of overfitting (Dai et al., 2022).For a small training set consisting only some of the samples, MSR selected a relatively large number of features, resulting in a good performance on the training set and a poor adaptation on unseen data.A similar result was observed when the RF model predicted N1 samples.Therefore, a large and representative amount of training data is crucial to achieve a good generalization ability of the constructed model, especially when employing machine learning algorithms (Xin et al., Testing results of different AGB estimation models trained from the N1, N2 and N3 sample set on the highest nitrogen treatment (N4) sample (A-E) and models trained from the N2, N3 and N4 sample set on the lowest nitrogen treatment (N1) sample (F-J).Dai et al. 10.3389/fpls.2024.1328834Frontiers in Plant Science frontiersin.org2020; Cao et al., 2021).In general, the ENet and PLSR models had better generalization performance for new data from both highnitrogen and low-nitrogen samples in this study.

Conclusions
To further enhance the accuracy of estimating rice AGB through VIs, this study proposed an innovative multidimensional estimation method that comprehensively considers spectral features, spatial structure, and phenological stages.We employed UAV-borne visible and multispectral cameras to capture digital surface models and canopy multispectral images of rice at various phenological stages.From these, we accurately extracted 20 VIs, PH, and CC.The AGB estimation models across phenological stages were constructed using five regression methods (MSR, ENet, PLSR, RF and SVR) with six different feature combinations (VIs alone, PH+CC, PH+CC+DAT, VIs+PH+CC, VIs+DAT, VIs+PH+CC+DAT).Effective testing was conducted on both plot and field experiments.The findings demonstrate the potential of drone technology for rapid and relatively accurate estimation of rice PH.Notably, the introduction of spatial parameters (PH and CC) and the temporal parameter (DAT) significantly improved the accuracy of AGB estimation.Among the models, the PLSR model with VIs and DAT as inputs, or with VIs, PH, CC, and DAT as inputs, achieved the optimal AGB estimation on plot data.Meanwhile, the ENet model showed the best estimation results on field data, highlighting its strong practical applicability.In summary, the integration of UAV remote sensing and multi-feature fusion provides an effective way to accurately estimate rice AGB.This will not only provide dependable technical guidance for remote sensing monitoring and field management of rice but also contribute to promoting the advancement of large-scale smart agriculture and precision farming.
(A) Location of the plots and fields in Jiangsu Province.(B) Plot experiments in Hohai University.(C) Field sampling in Xuzhou (Field 1, Field 2 and Field 3) and Suqian (Field 4 and Field 5).
. It divides the dataset into K folds and trains the model K times.A different fold is used as the validation set each time and the remaining folds are used as the training set.The results of each training are averaged to obtain the final model performance estimate.The cross-validation can effectively reduce the model overfitting problem and improve the reliability and generalization performance (Vu et al., 2022).
FIGURE 3 Correlation between the ground-measured plant height (PH) and extracted PH from UAV images from tillering to milk ripening stages: (A, B) tillering stage; (C) jointing-booting stage; (D) heading-flowering stage; (E) milk ripening stage; (F) total.

FIGURE 4
FIGURE 4 Pearson correlation coefficient (r) between 20 vegetation indices (VIs), plant height (PH), canopy cover (CC) with above-ground biomass (AGB) at different growth stages of rice.* and ** indicate significance at the 0.05 and 0.01 significance levels, respectively.
FIGURE 8 Accuracy of different AGB estimation models trained from the N1, N2 and N3 sample set and the N2, N3 and N4 sample set: (A) R 2 comparison; (B) RMSE comparison; (C) NRMSE comparison.

TABLE 2
Multispectral spectrum and reflectance of calibration boards.

TABLE 1
Ground data measurement in plot experiments.

TABLE 3
Vegetation indices (VIs) used in this study.

TABLE 4
Testing results of AGB estimation models on large-scale field data.PH+CC, PH+CC+DAT) were used as model inputs, the training results of the RF model were both optimal, with R 2 values of 0.69 and 0.75, and RMSE values of 1873 and 1699 kg/ ha, respectively.However, the accuracy of the RF model on the test set was the lowest when PH and CC were used as features, with an R 2 value of only 0.47 and an RMSE value exceeding 2000 kg/ha.In contrast, the testing accuracy remained high when PH, CC, and DAT were used as features.