- 1Institute for Interdisciplinary and Innovative Research, Xi’an University of Architecture and Technology, Xi’an, China
- 2Urban Horticulture Research and Extension Center, Shanghai Chenshan Botanical Garden, Shanghai, China
- 3College of Architecture, Nanjing Tech University, Nanjing, China
- 4Ecology Research Institute, Shanghai Academy of Environmental Sciences, Shanghai, China
Urban forest parks are vital ecological barriers that safeguard urban ecological security and provide essential ecosystem services. Aboveground biomass (AGB) is a key indicator for evaluating these services. This study targeted three tree species (Ligustrum lucidum, Camphora officinarum, and Koelreuteria paniculata) in Haiwan National Forest Park, Shanghai, China. Based on field-measured individual tree AGB, high-density point clouds from terrestrial laser scanning (TLS), and features from UAV multispectral imagery, four machine learning models were developed: Random Forest (RF), Extreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), and Support Vector Regression (SVR). SHapley Additive exPlanations (SHAP) analysis was conducted to identify key predictors and quantify their importance. The results show that: (1) Data fusion of TLS and multispectral imagery significantly improves estimation accuracy compared with single data sources, with RF consistently achieving the best performance across species (test set R2 = 0.96, 0.92, and 0.91 for L. lucidum, C. officinarum, and K. paniculata, respectively). (2) The effectiveness of data fusion varies by species: for C. officinarum and K. paniculata, fusion models outperformed TLS-only models by 2% and 5% in R2, respectively; for L. lucidum, fusion accuracy (R2 = 0.92) was comparable to TLS alone, both outperforming multispectral-only models. (3) SHAP analysis indicates that structural features from TLS, particularly the interaction between tree height and volume, dominate AGB estimation, contributing over 70% of the total feature importance, while spectral and vegetation index features (e.g., RE, NDVI, OSAVI) contribute about 20%. These findings demonstrate that integrating multi-source remote sensing data enables efficient and precise individual tree AGB estimation tailored to different species, providing a technical basis for intelligent monitoring of urban forests in the megacity of Shanghai.
1 Introduction
Urban forests constitute an important component of urban ecosystems. They enhance urban ecological environments, underpin the sustainable development of urban society and economy, and provide ecosystem services to urban residents, fostering harmonious coexistence between humans and nature (Erb et al., 2018). As the primary component of urban forests, trees are key determinants of the stability, productivity, and carbon storage capacity of urban ecosystems. Forest aboveground biomass (AGB) refers to the biomass of the aboveground parts of plants, encompassing leaves, stems, dead branches, flowers, and fruits (Obata et al., 2023). Urban forest AGB serves as a critical indicator of urban forests’ carbon sequestration capacity and a key parameter for assessing urban forest carbon budgets (Zhang and Liang, 2020). Individual trees, as the fundamental units of forests, are the primary contributors to urban carbon sinks, effectively mitigating the urban heat island effect and improving air quality. Accurate estimation of individual tree AGB is therefore essential for understanding their carbon sequestration functions and provides a critical foundation for biomass assessment at both regional and urban scales (Wang et al., 2023).
Traditional methods for measuring forest AGB, primarily relying on manual field surveys, are plagued by drawbacks such as time-intensive procedures and limited spatiotemporal flexibility (Ge et al., 2022). With advancements in remote sensing technology, scholars worldwide have begun estimating forest AGB by combining remote sensing data with limited ground measurements, spanning scales from individual trees to stands and regional levels (Li et al., 2015). While extensive research has been conducted on biomass estimation at the stand and regional scales, studies on individual tree AGB inversion remain in their early stages. This is partly due to challenges in acquiring high-resolution remote sensing data and the limited applicability of allometric growth equations to urban greenery trees.
Unmanned aerial vehicle (UAV) multispectral imagery, which provides centimeter-level spatial resolution, supports the extraction of texture and spectral features at the individual tree scale for AGB inversion (Guo et al., 2015; Huang et al., 2019). Terrestrial laser scanning (TLS), which captures detailed point cloud data of under-canopy tree trunks, compensates for the limitations of UAV multispectral imagery by extracting diameter at breast height (DBH) or creating three-dimensional tree models, so that the two data sources can be combined for high-precision biomass estimation (LaRue et al., 2020; Puletti et al., 2025). Sun et al. (2015) conducted TLS-only scans on 45 pure cypress plantations at Huangfengqiao Forest Farm in Hunan Province of China, with comparisons showing tree positional error under 20 cm and root mean square error (RMSE) under 5% for both estimated DBH and tree height. Cui et al. (2025) studied a 1-hectare mixed forest plot of Larix gmelinii and Betula nigra at the Mengjiagang Forest Farm in Heilongjiang Province of China, using TLS to extract individual tree attributes, achieving an average accuracy exceeding 95% for DBH and tree height for both tree species and developing a tree height prediction model. Ye et al. (2025) used UAV LiDAR data, in conjunction with three machine learning models and SHAP interpretability analysis, to estimate the individual tree DBH and AGB of Eucalyptus globulus and E. grandis in the Marlborough region of New Zealand. Brede et al. (2022) used TLS to empirically quantify and assess the structural volume of Southern pine deadwood according to decay grades. Jiang et al. (2025) utilized Sentinel data and airborne LiDAR data, combined with eight predictive models, to construct a three-stage optimization strategy that enhances the accuracy of ground-level biomass inversion for natural forests in Guangdong Province, China.
Wu et al. (2025) employed satellite radar data and four machine learning models to estimate the biomass of natural forests in Hunan, China, while also analyzing the drivers of biomass spatial distribution. Zhen et al. (2025) proposed a hybrid framework integrating the individual tree-based approach and area-based approach using multi-source remote sensing data from natural forests in northeastern China, providing technical support for estimating AGB from individual trees to large forest areas. On the one hand, existing studies on biomass estimation based on TLS-UAV fusion mainly focus on pure coniferous forests, and have established a relatively comprehensive research system in terms of method adaptability and accuracy verification (Verkerk et al., 2019; Heinrich et al., 2021; Sun et al., 2015; Puletti et al., 2020). In contrast, although studies on broad-leaved forests, especially coniferous-broad-leaved mixed forests in urban areas, have been conducted, research on key aspects such as collaborative extraction of multi-tree species features and biomass inversion under human-disturbed scenarios remains insufficient, and no universal technical framework has been formed yet. On the other hand, most of the existing relevant studies rely on UAV LiDAR data alone, and the R2 value for individual tree biomass inversion generally remains around 0.8 (Wang et al., 2023; Wu et al., 2024; Burt et al., 2019; Itakura et al., 2022). In this study, by combining the ability of TLS to accurately characterize the three-dimensional structure of trees with the spectral and textural features of UAV multispectral data, an inversion model integrating multi-source features was constructed. As a result, the R2 value for aboveground biomass inversion of individual trees increased to over 0.9, which exceeds the accuracy level reported in existing studies.
First, the Deep Forest algorithm was used to accurately screen target vegetation samples suitable for biomass inversion. On this basis, TLS structural data were fused with the rich canopy spectral and texture information from UAV multispectral data. Combined with four machine learning algorithms and SHAP (SHapley Additive exPlanations) interpretability analysis, feature selection and high-precision individual-tree biomass inversion were conducted for three typical urban greening tree species in Shanghai, namely, Ligustrum lucidum, Camphora officinarum, and Koelreuteria paniculata. Specifically, the study explores the following aspects: 1) the effect of multi-source remote sensing data fusion on enhancing the accuracy of urban individual tree AGB estimation; 2) the applicability and performance differences of various machine learning algorithms in urban individual tree AGB estimation; 3) the contribution of different remote sensing features to individual tree AGB inversion, quantified through SHAP analysis, and the underlying mechanisms. This study provides technical support and a scientific basis for urban forest carbon sink monitoring, ecosystem service assessment, and management in Shanghai and other similar regions.
2 Materials and methods
This study estimated individual tree aboveground biomass in five stages: data collection and preprocessing, feature extraction, feature selection, regression model construction, and accuracy evaluation (Figure 1). First, field measurement data and multi-source remote sensing data were preprocessed to generate individual tree AGB, individual tree point clouds, and individual tree multispectral images. Subsequently, eight structural features were extracted from the individual tree point clouds; seven spectral indices and eight texture indices related to individual tree crown area were extracted from the individual tree multispectral images. Next, three datasets were constructed for each tree species: point cloud features, multispectral features, and combined features from both sources. The SHapley Additive exPlanations (SHAP) feature selection method was applied to identify the top 10 most important features from each dataset. Finally, random forest (RF) (Singh et al., 2023), extreme gradient boosting (XGBoost) (Chen et al., 2020), support vector regression (SVR) (Alagulakshmi et al., 2025), and Light Gradient Boosting Machine (LightGBM) (Nguyen et al., 2024) were used to model AGB for the three tree species, and the optimal AGB estimation model was selected.
Figure 1. Technical roadmap for tree AGB inversion by fusing multi-source remote sensing and ground data.
2.1 Study area
The Haiwan National Forest Park (30°51′–30°52′N, 121°40′–121°43′E), covering a forest area of 634.7 hectares, is a prime example of Shanghai’s large-scale urban forest parks, with abundant artificial forest resources and a diverse composition of tree species (Figure 2). The annual average temperature is 16.1 °C, with an annual average precipitation of 1,190 mm. The terrain slopes slightly from northwest to southeast, with minimal elevation changes, characteristic of a typical coastal plain landscape. The soil, which originated from multiple land reclamation projects in Hangzhou Bay, is weakly alkaline. Dominant tree species include Ligustrum lucidum, Camphora officinarum, Triadica sebifera, Camptotheca acuminata, and Koelreuteria paniculata, which are mainly evergreen and deciduous broad-leaved species typical of urban forests in Shanghai (Shang et al., 2013).
Figure 2. Location map of the study area in Shanghai. The red rectangular frame on the right 3D map marks the sampling plot within the study area.
2.2 Multi-source data collection
2.2.1 Ground-measured forest AGB
A 1-hm2 (100 m × 100 m) near-natural artificial forest plot was established in the Haiwan National Forest Park of Shanghai for ground surveys. The selected plot features a typical tree species composition, favorable growth conditions and minimal human disturbance to ensure the representativeness and reliability of the data. In July 2024, the research team conducted angular measurements of the plot using Real-Time Kinematic (RTK) technology, alongside manual field surveys to record parameters such as the latitude and longitude, tree height, and DBH of each tree, totaling 435 individuals (Table 1).
Individual tree AGB was calculated using non-destructive allometric growth equations based on DBH. Equations 1–3 were developed and validated based on long-term research results on common greenery tree species in the Shanghai area (Wang et al., 2014; Zhang et al., 2018; Guo, 2017), with R2 values of 0.9788, 0.9861, and 0.8900, respectively. These models were therefore used to characterize the biomass properties of L. lucidum, C. officinarum, and K. paniculata in this study.
For L. lucidum:
For C. officinarum:
For K. paniculata:
Where W is the aboveground biomass of an individual tree (unit: kg); D is the measured diameter at breast height (unit: cm).
2.2.2 UAV multispectral data
UAV multispectral imagery was acquired using a DJI Mavic 3 drone equipped with four 1/2.8″CMOS sensors. Four monochromatic sensors cover the following wavebands respectively: green (G) (560 ± 16) nm; red (R) (650 ± 16) nm; red edge (RE) (730 ± 16) nm; and near-infrared (NIR) (840 ± 26) nm. Multispectral imagery was acquired on 10 August 2024, under clear and windless conditions. The drone was set to fly at a constant speed of 3.0 m/s at an altitude of 120 m, using equidistant photography mode with a 2.0-s interval between shots. During the data acquisition phase, a DJI multispectral radiation calibration panel was placed in a flat and unobstructed area within the sample plot. To account for changes in illumination throughout the entire flight process, 3 sets of calibration panel images were captured respectively before each takeoff and after each landing of the UAV to complete the radiation reference calibration. After the acquisition work was finished, DJI Terra 4.1.0 (https://www.dji.com/nz/dji-terra) was used to perform geometric correction and mosaicking on the aerial images, and finally, an orthomosaic image with a spatial resolution of 2.2 cm/pixel was generated.
2.2.3 TLS data acquisition
A SLAM200 high-precision 3D LiDAR scanner, with a point cloud positioning accuracy of ±5 mm and an effective scanning radius of 300 m, was used to scan the 1-hectare plot comprehensively on 29 July 2024, following a “return” route. To ensure data integrity, control targets were placed every 100 m to enable precise stitching of adjacent scan stations. The point cloud density was set to 10,000 points per square meter, resulting in a total of 26.1 GB of point cloud data, which covers the complete three-dimensional spatial information of the tree layer, shrub layer, and surface microtopography within the sample plot.
Ground point classification is a fundamental operation in point cloud data processing. LiDAR360 employs the Improved Progressive TIN Densification (IPTD) algorithm (Zhao et al., 2016) to classify ground points. This algorithm first generates a sparse triangulated network using seed points, then iteratively densifies it layer by layer until all ground points are classified. The specific steps are as follows:
Step 1, Initial seed point selection. For point cloud data containing buildings, the maximum building dimension is measured to determine the grid size for grid-based processing. For point cloud data without buildings, the default grid size is used. The lowest point within each grid cell is selected as an initial seed point.
Step 2, Triangulated Irregular Network (TIN) construction. The initial TIN is constructed using the selected seed points.
Step 3, Iterative refinement process. Iterate through all points awaiting classification. For each point, determine the triangle containing its horizontal projection. Calculate the distance d from the point to the triangle and the maximum angle formed between the point and any of the triangle’s three vertices relative to the triangle’s plane. Compare these values against the iteration distance and iteration angle. If both are below the corresponding thresholds, classify the point as a ground point and add it to the triangular mesh. Repeat this process until all ground points are classified.
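The acceptance test in Step 3 can be sketched as follows. This is a minimal illustration of the distance-and-angle criterion only, not LiDAR360's implementation: the function names are hypothetical, and the thresholds used in the usage example (1.4 m and 8°) are assumed values chosen for demonstration.

```python
import math

def _sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _norm(v):
    """Euclidean length of a 3D vector."""
    return math.sqrt(v[0] ** 2 + v[1] ** 2 + v[2] ** 2)

def plane_unit_normal(a, b, c):
    """Unit normal of the plane through triangle vertices a, b, c."""
    u, v = _sub(b, a), _sub(c, a)
    n = (u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0])
    length = _norm(n)
    if length == 0.0:
        raise ValueError("degenerate triangle")
    return (n[0] / length, n[1] / length, n[2] / length)

def is_ground_point(p, triangle, max_dist, max_angle_deg):
    """Return True if candidate p passes both densification thresholds.

    d is the perpendicular distance from p to the triangle's plane; the
    angle at each vertex is the angle between the plane and the line from
    that vertex to p, i.e. asin(d / |p - vertex|).
    """
    a, b, c = triangle
    n = plane_unit_normal(a, b, c)
    w = _sub(p, a)
    d = abs(n[0] * w[0] + n[1] * w[1] + n[2] * w[2])
    max_angle = 0.0
    for vertex in triangle:
        r = _norm(_sub(p, vertex))
        if r > 0.0:
            angle = math.degrees(math.asin(min(1.0, d / r)))
            max_angle = max(max_angle, angle)
    return d < max_dist and max_angle < max_angle_deg

# Usage with assumed thresholds (iteration distance 1.4 m, iteration angle 8 degrees)
tin_triangle = ((0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 10.0, 0.0))
low_point = (2.0, 2.0, 0.3)    # close to the surface
high_point = (2.0, 2.0, 3.0)   # e.g. a return from a shrub or trunk
accepted_low = is_ground_point(low_point, tin_triangle, 1.4, 8.0)
accepted_high = is_ground_point(high_point, tin_triangle, 1.4, 8.0)
```

In a full densification loop, each accepted point would be inserted into the TIN before the next iteration, so the surface gradually follows the terrain.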
This study employed a distance-based Euclidean Cluster Extraction algorithm, a point cloud grouping method based on spatial distance metrics (Yang et al., 2023), to perform individual tree segmentation on TLS point cloud data. Its core logic is to aggregate spatially adjacent points into independent clusters by checking whether the Euclidean distance between two points in the point cloud falls below a predetermined threshold, thereby segmenting discrete objects. The algorithm relies on a KD-tree data structure to speed up nearest neighbor queries, enabling rapid processing of large-scale LiDAR point cloud data.
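The clustering logic can be illustrated with a short sketch. For brevity it replaces the KD-tree with a brute-force neighbor search, which is only practical for small point sets; the function name, the tolerance, and the sample coordinates are hypothetical.

```python
from collections import deque

def euclidean_clusters(points, tolerance, min_size=1):
    """Group 3D points into clusters by region growing.

    Two points belong to the same cluster if they are connected by a chain
    of points whose consecutive distances do not exceed `tolerance`.
    Neighbor search is brute force here; a KD-tree would replace the inner
    scan for large point clouds.
    """
    tol2 = tolerance * tolerance
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        queue = deque([seed])
        cluster = [seed]
        while queue:
            i = queue.popleft()
            xi, yi, zi = points[i]
            near = [j for j in unvisited
                    if (points[j][0] - xi) ** 2
                    + (points[j][1] - yi) ** 2
                    + (points[j][2] - zi) ** 2 <= tol2]
            for j in near:
                unvisited.remove(j)
                queue.append(j)
                cluster.append(j)
        if len(cluster) >= min_size:
            clusters.append(sorted(cluster))
    return sorted(clusters)

# Two well-separated groups of points (hypothetical coordinates, metres)
sample_points = [(0.0, 0.0, 0.0), (0.3, 0.0, 0.0), (0.6, 0.0, 0.0),
                 (5.0, 5.0, 0.0), (5.2, 5.0, 0.0)]
tree_clusters = euclidean_clusters(sample_points, tolerance=0.5)
```

Points 0 and 2 are 0.6 m apart, more than the tolerance, but they still end up in one cluster because point 1 links them; this chaining behavior is also why overlapping crowns tend to merge into a single segment.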
2.3 Multi-source data fusion
2.3.1 Vegetation classification
Prior to individual tree segmentation and feature extraction, accurately distinguishing trees from non-trees in UAV multispectral imagery is a critical preprocessing step for ensuring the accuracy of subsequent analyses. In this study, spectral features, vegetation indices, and texture features derived from UAV multispectral data were integrated, and the Deep Forest (DF) algorithm was employed for vegetation classification (Zhou and Feng, 2018). Vegetation indices are effective indicators for evaluating vegetation growth conditions. Selecting appropriate vegetation indices can significantly improve vegetation classification accuracy and biomass inversion accuracy (Chen et al., 2019). Using UAV multispectral data as the source, five bands (red, green, blue, near-infrared (NIR), and red edge (RE)) were acquired, and seven vegetation indices were calculated, as listed in Table 2. The DJI Mavic 3 drone used can simultaneously capture GSDDSM data in addition to the aforementioned five spectral bands. GSDDSM (Ground Sampling Distance-Digital Surface Model) is a derivative product of the Digital Surface Model (DSM) that integrates ground sampling distance (GSD). By associating DSM elevation with GSD parameters and performing calibration, GSDDSM preserves elevation information while incorporating pixel-scale features. This enables precise characterization of the “elevation-spatial scale” variations in terrain features, supporting detailed identification of small-scale objects.
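As an illustration of how such indices are derived from band reflectance, the sketch below computes four indices named in this paper (NDVI, NDRE, OSAVI, and RVI) in their commonly used forms. The exact formulations in Table 2 are not reproduced here and may differ, for example in the OSAVI scaling; the reflectance values are hypothetical.

```python
def _safe_ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 where the denominator is zero."""
    return numerator / denominator if denominator != 0 else 0.0

def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - R) / (NIR + R)."""
    return _safe_ratio(nir - red, nir + red)

def ndre(nir, red_edge):
    """Normalized Difference Red Edge index: (NIR - RE) / (NIR + RE)."""
    return _safe_ratio(nir - red_edge, nir + red_edge)

def osavi(nir, red, soil_factor=0.16):
    """Optimized Soil-Adjusted Vegetation Index in its common form:
    (NIR - R) / (NIR + R + 0.16)."""
    return _safe_ratio(nir - red, nir + red + soil_factor)

def rvi(nir, red):
    """Ratio Vegetation Index: NIR / R."""
    return _safe_ratio(nir, red)

# Hypothetical surface reflectance of one canopy pixel (unitless, 0-1)
pixel = {"red": 0.10, "red_edge": 0.30, "nir": 0.50}
indices = {
    "NDVI": ndvi(pixel["nir"], pixel["red"]),
    "NDRE": ndre(pixel["nir"], pixel["red_edge"]),
    "OSAVI": osavi(pixel["nir"], pixel["red"]),
    "RVI": rvi(pixel["nir"], pixel["red"]),
}
```

The additive soil factor in OSAVI damps the index where both bands are low, which is the usual motivation for using it to reduce soil background effects.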
Texture features characterize the horizontal structural attributes of an image, reflecting spatial variation patterns and the spatial correlation of grayscale values. They are widely applied in vegetation classification, forest biomass estimation, and related fields (Chen et al., 2020). This study used the texture function glcmTexture provided by Google Earth Engine, which is based on the Gray-Level Co-occurrence Matrix (GLCM). The calculation employed 64 gray levels, a 3 × 3 pixel moving window, and four directional offsets (0°, 45°, 90°, and 135°). The eight most commonly used features were selected: angular second moment, contrast, homogeneity, correlation, dissimilarity, variance, mean, and entropy (Soh, 1999; Aerts et al., 2014). To reduce data redundancy while preserving spectral information, principal component analysis (PCA) was applied to derive three principal components, and the first principal component (PCA1) was used to calculate the eight texture features.
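To make the GLCM idea concrete, the following sketch builds a symmetric, normalized co-occurrence matrix for a single offset and derives four of the eight features from it. It is a plain-Python illustration, not the Google Earth Engine implementation: the function names are hypothetical, only one direction is used (results for the four directions would typically be averaged), and the tiny images stand in for a quantized PCA1 window.

```python
import math

def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalized gray-level co-occurrence matrix for one offset.

    `image` is a list of rows of integer gray levels in [0, levels).
    (dx, dy) = (1, 0) corresponds to the 0 degree direction.
    """
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1.0
                counts[j][i] += 1.0  # count the pair in both orders
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("no pixel pairs for this offset")
    return [[v / total for v in row] for row in counts]

def texture_features(p):
    """Contrast, angular second moment, entropy, and homogeneity of a GLCM."""
    contrast = asm = entropy = homogeneity = 0.0
    for i, row in enumerate(p):
        for j, v in enumerate(row):
            if v == 0.0:
                continue
            diff2 = (i - j) ** 2
            contrast += v * diff2
            asm += v * v
            entropy -= v * math.log(v)
            homogeneity += v / (1.0 + diff2)
    return {"contrast": contrast, "asm": asm,
            "entropy": entropy, "homogeneity": homogeneity}

# A uniform patch has no texture; a checkerboard patch has maximal local contrast
uniform = [[2, 2, 2], [2, 2, 2], [2, 2, 2]]
checker = [[0, 1], [1, 0]]
uniform_features = texture_features(glcm(uniform, levels=4))
checker_features = texture_features(glcm(checker, levels=2))
```

For the uniform patch the contrast and entropy are zero and the angular second moment is one, whereas the checkerboard patch yields a contrast of one, which matches the intuition that contrast and entropy respond to local heterogeneity.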
The Deep Forest algorithm is a deep learning method built on ensemble learning rather than on neural networks. It primarily consists of multiple cascaded forest structures and a multi-grained scanning module. Unlike traditional deep learning methods, it does not require a large amount of data for pre-training, has lower data dependency, and exhibits strong interpretability (Zhou and Feng, 2018). During model development, stratified random sampling was applied to the preprocessed data, with the dataset split into a training set and a testing set in a 7:3 ratio. A random seed was set to ensure the reproducibility of results. Model performance was evaluated using accuracy, precision, recall, and F1-score metrics (Equations 4–7).
TP (True Positive) and TN (True Negative) represent the counts of samples correctly predicted as positive and negative, respectively. Conversely, FP (False Positive) and FN (False Negative) are the counts of samples incorrectly predicted as positive and negative, respectively.
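For reference, the four metrics can be written in their standard forms in terms of these counts. The expressions below are the conventional definitions and are assumed, rather than confirmed, to correspond to the formulations used for Equations 4–7.

```latex
\begin{align}
\mathrm{Accuracy}  &= \frac{TP + TN}{TP + TN + FP + FN} \\
\mathrm{Precision} &= \frac{TP}{TP + FP} \\
\mathrm{Recall}    &= \frac{TP}{TP + FN} \\
F1 &= \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}
\end{align}
```

Because F1 is the harmonic mean of precision and recall, it is high only when both are high, which makes it more informative than accuracy when vegetation and non-vegetation samples are imbalanced.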
2.3.2 Biomass model based on fusion of UAV multispectral and TLS data
The acquired TLS data were imported into Feima Robotics 1.7.2 (https://feimarobotics.com/zhcn), where the raw point cloud data obtained from scanning were stitched and denoised. High-precision registration of point cloud coordinates with field-measured individual tree coordinates was achieved using the least squares method. LiDAR360 V7.2 (https://www.lidar360.com/LiDAR360) was then used to crop the sample point cloud data and remove outlier noise points using a 3-standard-deviation threshold. Subsequently, Gaussian filtering was applied to smooth the point cloud, followed by point cloud normalization and individual tree segmentation.
The crown vector data obtained from the TLS-based individual tree segmentation were calibrated and aligned with the multispectral data using control points. Based on the aligned crown boundaries, spectral reflectance, vegetation indices, and texture features of individual trees were extracted from the multispectral data; eight individual tree structural parameters, including tree height, crown area, and tree volume, were obtained from the TLS data (Table 3). Three datasets were constructed for each tree species: point cloud features, multispectral features, and a combined dataset of both.
2.4 Variable selection methods
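SHAP is an explainable machine learning tool based on game theory concepts. It calculates the marginal contribution of each feature to the model’s output, thereby quantifying their relative importance in the model’s decision-making process. The SHAP values are derived from the Shapley value concept, which allocates a marginal contribution to each feature such that the sum of all contributions equals the difference between the model’s prediction and a baseline value, typically the mean of the target variable (Li, 2022). Let $f(x)$ denote the model’s prediction for a sample $x$ described by $M$ features; in its standard additive form, this property can be written as (Equation 8):
$$f(x) = \phi_0 + \sum_{i=1}^{M} \phi_i$$
where $\phi_0$ is the baseline value and $\phi_i$ is the SHAP value (marginal contribution) of the $i$-th feature.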
Model interpretation in SHAP begins with the construction of an explainer, which supports various model types, including deep, gradient, kernel, tree, and sampling explainers (Chen et al., 2022). For tree-based models—such as XGBoost, LightGBM, and CatBoost—SHAP’s tree explainer offers both efficient and accurate feature attribution. Global interpretability in SHAP assesses overall feature importance across the entire dataset, where SHAP values farther from zero indicate greater contribution to model output (Chen et al., 2023). Each feature can exert both positive and negative influences, depending on its directional effect on the prediction. In this study, SHAP was applied in conjunction with multiple machine learning models, including RF, SVR, XGBoost, and LightGBM. For each model, SHAP values were calculated and ranked by absolute magnitude. The top 10 features were selected as key variables to improve both predictive performance and computational efficiency.
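The additive property of Shapley values and the ranking by absolute magnitude can be demonstrated with a brute-force sketch on a toy model. This is not the SHAP library and not the explainer used in this study: the toy predictor, feature values, and baseline are hypothetical, and absent features are simply replaced by a single reference value instead of being averaged over background data.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values by enumerating all feature subsets.

    Features outside the subset are set to their `baseline` value.
    The cost grows as 2^n, so this is only feasible for a few features.
    """
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

def toy_agb_model(z):
    """Hypothetical predictor: height x volume interaction plus a spectral term."""
    height, volume, ndvi = z
    return 0.5 * height * volume + 2.0 * ndvi

feature_names = ["H", "TV", "NDVI"]
tree = [10.0, 4.0, 0.8]        # hypothetical feature values of one tree
reference = [6.0, 2.0, 0.5]    # hypothetical baseline (e.g. feature means)

phi = shapley_values(toy_agb_model, tree, reference)
total_abs = sum(abs(v) for v in phi)
importance_share = {name: abs(v) / total_abs for name, v in zip(feature_names, phi)}
ranking = sorted(feature_names, key=lambda name: importance_share[name], reverse=True)
```

The contributions sum to the difference between the prediction for the tree and the prediction at the baseline, and the interaction between the two structural features is split between them. Averaging the absolute values over all trees, rather than using one tree as here, gives the global importance used to select the top features.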
2.5 Model evaluation
The datasets from the three data sources (TLS point cloud data, UAV multispectral data, and combined data) were divided into training and test sets using a stratified random sampling method with a ratio of 7:3. Model hyperparameters were optimized on the training set using grid search combined with five-fold cross-validation. Four machine learning algorithms (RF, XGBoost, SVR, and LightGBM) were used to build individual tree AGB regression models. The model construction process was implemented in a Python 3.7 environment using the Scikit-Learn library. The coefficient of determination (R2) and root mean square error (RMSE) were chosen to quantitatively validate and assess the accuracy of the biomass inversion model (Forkuor et al., 2020; Ma et al., 2025). The R2 value ranges from 0 to 1, with values closer to 1 indicating higher estimation accuracy. The RMSE represents the difference between the predicted and measured values, with smaller values indicating higher estimation accuracy. The calculation formulas are as follows (Equations 9 and 10):
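$$R^2 = 1 - \frac{\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}{\sum_{i=1}^{n}(y_i - \bar{y})^2} \quad (9)$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2} \quad (10)$$
In the equations, $n$ represents the sample size, $y_i$ is the measured AGB of the $i$-th tree, $\hat{y}_i$ is the corresponding predicted AGB, and $\bar{y}$ is the mean of the measured values.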
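A minimal sketch of how R2 and RMSE can be computed from paired measured and predicted values, assuming the standard definitions of both metrics. The function names and the AGB values are hypothetical and serve only as a worked example.

```python
import math

def rmse(measured, predicted):
    """Root mean square error between measured and predicted values."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(y - y_hat) ** 2 for y, y_hat in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_y = sum(measured) / len(measured)
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(measured, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in measured)
    if ss_tot == 0:
        raise ValueError("measured values have zero variance")
    return 1.0 - ss_res / ss_tot

# Hypothetical measured and predicted AGB values (kg) for four trees
measured_agb = [10.0, 20.0, 30.0, 40.0]
predicted_agb = [12.0, 18.0, 33.0, 39.0]
test_r2 = r_squared(measured_agb, predicted_agb)   # 1 - 18/500 = 0.964
test_rmse = rmse(measured_agb, predicted_agb)      # sqrt(18/4), about 2.12 kg
```

RMSE is expressed in the unit of the target (kg), so it should be read against the typical AGB of each species, whereas R2 is unitless and easier to compare across species.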
3 Results
3.1 Classification of vegetation and non-vegetation
Accurate classification of vegetation and non-vegetation is a critical preprocessing step for achieving high-precision individual tree segmentation and biomass estimation. In this study, the Deep Forest algorithm was applied to classify vegetation and non-vegetation in preprocessed UAV multispectral imagery. The results indicated that classification based solely on RGB data achieved an accuracy of 0.88. When RGB data were combined with the RE band or with GSDDSM, the classification accuracy improved to 0.94 and 0.92, respectively. When each of the seven spectral indices was added, the improvement in classification accuracy differed markedly among indices, with OSAVI and NDVI performing particularly well (0.94 and 0.89, respectively). Further combining the best-performing spectral index, OSAVI, with GSDDSM or RE yielded accuracies of 0.98 and 0.96, respectively. In the feature combination experiments of RGB and the eight texture indices, contrast and entropy showed the largest improvements, with classification accuracies of 0.91 and 0.90, respectively. When contrast, the best-performing texture index, was combined with GSDDSM or RE, the accuracy improved to 0.95 and 0.97, respectively. Ultimately, the feature combination of RGB + OSAVI + mean + GSDDSM + RE achieved the highest accuracy of 0.99, effectively integrating spectral, texture, terrain, and vegetation physiological characteristics for precise vegetation and non-vegetation classification (Figure 3).
Figure 3. Vegetation and non-vegetation classification. (a) Vegetation and non-vegetation classification map of Haiwan National Forest Park based on the Deep Forest algorithm, (b) Local classification results, (c) Classification accuracy of different combinations.
3.2 Individual tree segmentation
Accurate individual tree segmentation is an essential step in obtaining structural parameters such as tree height and crown diameter at the individual-tree scale, directly impacting the accuracy of subsequent biomass estimation. This study employed a distance-based Euclidean clustering algorithm to perform individual tree segmentation on TLS point cloud data, and then matched the segmented individual tree crown diameters with UAV multispectral data (Figure 4). Field surveys revealed that crown overlap and interference from small trees often lead to over-segmentation or under-segmentation during individual tree identification, thereby affecting the accuracy of structural parameter extraction. Consequently, manual editing of incorrectly segmented trees, followed by recalculation of their attributes, was required. A total of 435 trees were surveyed in the field, with 369 identified through preliminary automated segmentation. After manual post-processing, 350 high-quality individual tree samples were obtained, comprising 200 L. lucidum, 100 C. officinarum, and 50 K. paniculata. Evaluation of segmentation performance indicated a recall rate of 84.83%, a precision of 93.18%, and an F1 score of 88.76% for individual-tree identification in this study.
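The reported recall can be retraced from the tree counts. The sketch below assumes that the 369 automatically identified trees all count as correct matches (true positives) among the 435 surveyed trees; the text does not state the number of detected segments, so precision is not recomputed here.

```python
def recall(true_positives, false_negatives):
    """Recall = TP / (TP + FN)."""
    return true_positives / (true_positives + false_negatives)

surveyed_trees = 435   # trees recorded in the field survey
matched_trees = 369    # trees identified by automated segmentation (assumed TP)
missed_trees = surveyed_trees - matched_trees   # assumed FN

segmentation_recall = recall(matched_trees, missed_trees)
recall_percent = round(segmentation_recall * 100, 2)   # 84.83
```

Under this assumption, 66 surveyed trees were missed by the automated step, which is consistent with the crown overlap and small-tree interference described above.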
Figure 4. Individual tree segmentation and crown outline based on TLS. (a) Individual tree crown outline and morphological structure extracted from TLS data, (b) Registration of the crown vector boundary obtained from TLS segmentation with multispectral data.
3.3 Variable screening results
To elucidate the contribution of each remote sensing feature to individual tree AGB estimation and to identify the key driving factors, this study employed the SHAP method to quantify and rank feature importance. For each tree species, the top 10 features with the highest SHAP scores were selected as preferred features for the three types of datasets (TLS, multispectral, and fusion data). When modelling with TLS data alone, the combined importance of total volume (TV) and tree height (H) for AGB inversion in L. lucidum reached 62.53% (Figure 5a); the combined importance of TV and H for C. officinarum reached 58.41% (Figure 5c); and the combined importance of TV and the natural logarithm of H [Ln(H)] for K. paniculata reached 51.90% (Figure 5e). When modelling using only multispectral data, the dependence on spectral features varied significantly among tree species: the importance of NDRE and EVI features for L. lucidum was 25.44%, that of RE and RVI for C. officinarum was 16.72%, and that of red and green for K. paniculata was 30.04%. In the combined TLS–multispectral dataset, three of the five most important variables for all three species were TLS-derived features (TV, H, and H2), which together accounted for more than 60% of the total importance in AGB inversion. Spectral vegetation features, including red, RE, NDVI and OSAVI, contributed approximately 20% (Figures 5b,d,f). Comparing the feature importance distributions across the three datasets reveals that structural features derived from TLS point cloud data dominate in individual tree AGB inversion, indicating that tree geometric parameters have core explanatory power for biomass estimation, while spectral information from multispectral data plays a supplementary role.
Figure 5. Analysis of the top 10 features of the three tree species based on RF models and SHAP quantification. The top 10 variables are ranked by the mean of absolute SHAP values. Each point represents an individual tree’s SHAP value for that feature, with color indicating the feature’s value (low = blue to high = red): (a) L. lucidum, top 10 important features using TLS data only, (b) L. lucidum, top 10 important features from the combined TLS and UAV multispectral data, (c) C. officinarum, top 10 important features using TLS data only, (d) C. officinarum, top 10 important features from the combined TLS and UAV multispectral data, (e) K. paniculata, top 10 important features using TLS data only, (f) K. paniculata, top 10 important features from the combined TLS and UAV multispectral data.
3.4 Biomass estimation and model accuracy comparison
This study employed TLS point cloud data and UAV multispectral imagery as data sources, constructing three datasets: TLS data, multispectral data, and their combination. These were combined with field-measured AGB for L. lucidum, C. officinarum, and K. paniculata to establish four models (RF, XGBoost, LightGBM, and SVR) for AGB estimation and accuracy comparison. The relevant results are shown in Table 4 and Figures 6, 7.
Figure 6. AGB inversion of three tree species using the RF model. (a) C. officinarum (using only TLS data), (b) L. lucidum (using only TLS data), (c) K. paniculata (using only TLS data), (d) C. officinarum (using TLS and multispectral data), (e) L. lucidum (using TLS and multispectral data), (f) K. paniculata (using TLS and multispectral data).
Figure 7. Inversion of single tree AGB for three tree species. (a) L. lucidum, (b) C. officinarum, (c) K. paniculata.
For all three tree species, the accuracy of the four biomass inversion models using multispectral data alone was lower than that using TLS data alone. For L. lucidum, using TLS data for AGB prediction resulted in a 78.43% improvement in the test set R2 compared to using only multispectral data. The accuracy of the combined data and TLS data models was comparable, with the optimal model being RF, achieving a test set R2 of 0.92 and RMSE values of 1.14 kg and 1.15 kg, respectively (Table 4; Figures 6b,e). For C. officinarum, using TLS data for AGB prediction resulted in a 71.2% increase in test set R2 compared to using only multispectral data. When combining TLS data and multispectral data for AGB prediction, the prediction accuracy of all four models improved to varying degrees. The optimal model for the combined data was RF, which improved the test set R2 by 2% and reduced the RMSE by 6.61 kg compared to the RF model using only TLS data (Table 4; Figures 6a,d). For K. paniculata, the combined dataset substantially improved the predictive performance of all four models compared with multispectral-only modeling. On the combined dataset, the RF model achieved the best performance, surpassing the other three models. It improved the test set R2 by 65.29% compared to the optimal model using multispectral data, and improved the test set R2 by 5% and reduced the RMSE by 2.14 kg compared to the optimal model using TLS data (Table 4; Figures 6c,f).
After evaluating model accuracy, this study integrated TLS point cloud data and UAV multispectral imagery and applied the RF model to perform spatial inversion mapping of AGB for the three target tree species in the study area. The estimated mean AGB was 10.45 kg for L. lucidum, 107.13 kg for C. officinarum, and 33.06 kg for K. paniculata. The spatial distribution aligns well with actual survey results, validating the effectiveness of this model for spatial inversion within Shanghai’s urban forest parks.
4 Discussion
4.1 Improving classification accuracy by combining multi-source feature fusion with the deep forest algorithm
This study used multispectral imagery from unmanned aerial vehicles (UAVs) to extract spectral features, vegetation indices, texture features, and terrain features, and employed the Deep Forest algorithm to classify vegetation and non-vegetation. The results demonstrated that integrating multiple feature types substantially improved the model’s classification accuracy in complex landform conditions. Specifically, classification using only RGB data achieved an accuracy of 0.88, underscoring its limited ability to discriminate complex landforms. Combining RGB with the RE band, which is sensitive to vegetation physiological conditions, and with GSDDSM improved classification accuracy to 0.94 and 0.92, respectively, indicating that structural and physiological information plays a crucial role in distinguishing trees from non-trees (Abdollahnejad and Panagiotidis, 2020). In experiments combining seven spectral indices, the influence of different indices on classification accuracy varied notably. Among them, OSAVI and NDVI performed best, as they effectively reduced soil background interference and highlighted vegetation information, achieving classification accuracies of 0.94 and 0.89, respectively. When OSAVI was combined with GSDDSM and RE, respectively, classification accuracy further improved to 0.98 and 0.96, validating the synergistic enhancement effect of multi-source feature fusion (Purwanto et al., 2023). In the combination experiments of RGB with eight texture indices, contrast and entropy performed best, as they effectively capture image spatial structure and complexity, achieving classification accuracies of 0.91 and 0.90, respectively. Combining contrast, the highest-accuracy texture index, with GSDDSM and RE achieved classification accuracies of 0.95 and 0.97, respectively.
Ultimately, adopting a feature fusion scheme integrating RGB, OSAVI, contrast, GSDDSM, and RE significantly improved classification accuracy to 0.99, demonstrating the clear advantages of multi-source remote sensing feature synergy (Gutkin et al., 2023).
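To illustrate why OSAVI is less affected by soil background than NDVI, the sketch below computes both indices from near-infrared and red reflectance. It assumes the commonly used forms NDVI = (NIR - Red) / (NIR + Red) and OSAVI = (NIR - Red) / (NIR + Red + 0.16); some formulations multiply OSAVI by 1.16, and the exact definition used in this study is the one given in its methods, which may differ. The reflectance values are hypothetical.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    denom = nir + red
    # Guard against a zero denominator (e.g. masked or no-data pixels).
    return (nir - red) / denom if denom else 0.0

def osavi(nir, red, soil_factor=0.16):
    """Optimized Soil-Adjusted Vegetation Index.

    Assumed form without the 1.16 gain; soil_factor is the soil
    adjustment term added to the denominator.
    """
    return (nir - red) / (nir + red + soil_factor)

# Hypothetical reflectances on a 0-1 scale, not measurements from the study.
pixels = {
    "dense canopy": {"nir": 0.50, "red": 0.10},
    "sparse cover": {"nir": 0.25, "red": 0.20},
}

for label, px in pixels.items():
    print(label,
          "NDVI =", round(ndvi(px["nir"], px["red"]), 3),
          "OSAVI =", round(osavi(px["nir"], px["red"]), 3))
```

The constant in the OSAVI denominator damps the index where total reflectance is low, which is the mechanism behind the reduced soil background interference mentioned above.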
Guo et al. (2022) utilized UAV multispectral data, selecting spectral reflectance, vegetation indices, and geometric texture features, and combined object-oriented classification methods with random forest algorithms to classify trees in Fuzhou City’s urban forests, achieving an accuracy of 91.89%. In contrast, this study expanded and optimized feature selection, incorporating not only the aforementioned basic features but also terrain features and specific wavelength information more sensitive to vegetation physiological states, resulting in a more diverse and comprehensive feature combination. In terms of algorithm selection, this study introduced the Deep Forest algorithm, which possesses strong feature learning capabilities and complex pattern recognition capabilities, to enhance the expressive performance of the classification model. Through expanding feature dimensions and improving algorithmic performance, the study ultimately achieved higher tree classification accuracy, further validating the benefits of multi-source feature fusion and advanced algorithms for complex object classification tasks.
4.2 SHAP-based interpretability analysis of individual tree AGB
When using TLS data modeling alone, TV and H are always the core features for estimating the aboveground biomass of individual trees for L. lucidum, C. officinarum, and K. paniculata. Specifically, TV and H accounted for 62.53% of the total feature importance for L. lucidum, 58.41% for C. officinarum, and 51.90% for K. paniculata. This indicates that structural characteristics (especially volume and height), as direct quantitative indicators of tree three-dimensional morphology, are key factors determining individual tree biomass, as biomass is primarily distributed in the trunk and large branches of the woody parts. This result is consistent with the studies by Cao et al. (2025) and Whelan et al. (2023), which indicate that measurements based on tree wood volume have the potential to characterize complex forest structures and enhance model generalizability when predicting AGB in different forest types. When modeling using only multispectral data, the importance of spectral features is relatively lower than that of TLS structural features, but they still exhibit significant species specificity. For L. lucidum, NDRE and EVI together accounted for 25.44% of the importance in AGB estimation, which can be attributed to their high sensitivity to chlorophyll content and canopy density. For C. officinarum, RE and RVI accounted for 16.72% of the importance, while for K. paniculata, red and green light features contributed the most, reaching 30.04%. These differences reflect the biological and physical characteristics of different tree species in terms of leaf pigment composition, canopy structure, or light response. When further integrating TLS with multispectral data for modeling, SHAP analysis results highlight the dominant role of TLS structural features in AGB estimation. Among the top five most important variables for all three species, three were TLS features—TV, H, and H2, whose combined importance exceeded 60% in AGB prediction. 
This further underscores the irreplaceable role of high-precision three-dimensional structural information in individual tree biomass estimation. Meanwhile, the contribution rates of spectral vegetation features such as Red, RE, NDVI, and OSAVI are approximately 20%, indicating that they serve as supplementary information to complement physiological or canopy surface information not captured by structural features, thereby further enhancing model performance (Oehmcke et al., 2024).
A comprehensive comparison of the feature importance distributions of the three datasets clearly reveals that the three-dimensional structural features of TLS point cloud data play an absolutely dominant role in individual tree AGB inversion, indicating that tree geometric parameters have core explanatory power for biomass estimation. In contrast, the spectral and texture information provided by multispectral data plays a key supplementary role in reflecting tree physiological status and canopy surface reflectance characteristics.
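The percentage contributions discussed in this section can be derived from SHAP values in more than one way. The sketch below shows one common convention, assumed here for illustration: each feature's importance is its mean absolute SHAP value across samples, normalized so that all features sum to 100%. The feature names TV, H, and NDVI follow the text, but the SHAP matrix is a toy example and the function name is hypothetical.

```python
def shap_importance_share(shap_rows, feature_names):
    """Return each feature's share (%) of total mean |SHAP|.

    shap_rows: list of per-sample lists, one SHAP value per feature.
    """
    n_samples = len(shap_rows)
    mean_abs = [
        sum(abs(row[j]) for row in shap_rows) / n_samples
        for j in range(len(feature_names))
    ]
    total = sum(mean_abs)
    return {name: 100.0 * value / total
            for name, value in zip(feature_names, mean_abs)}

# Toy SHAP values (kg) for three trees; not results from the study.
features = ["TV", "H", "NDVI"]
toy_shap = [
    [4.0, -2.0, 1.0],
    [-6.0, 2.0, -1.0],
    [5.0, -2.0, 1.0],
]

shares = shap_importance_share(toy_shap, features)
# Combined share of the two structural (TLS-derived) features.
structural_share = shares["TV"] + shares["H"]

for name, share in shares.items():
    print(f"{name}: {share:.1f}%")
print(f"TV + H: {structural_share:.1f}%")
```

Summing the shares of grouped features, as done for TV and H here, mirrors how the combined importance of structural or spectral features is reported in the text.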
4.3 Impact of model selection on individual tree AGB estimation
In this study, different modeling algorithms showed varying performance in individual tree AGB prediction. Four machine learning algorithms were evaluated for biomass estimation in Shanghai’s public forests, and the RF model consistently demonstrated the best performance across the three tree species. SVR is sensitive to kernel function selection and parameter tuning, and has low computational efficiency when handling large-scale point cloud-derived features. LightGBM and XGBoost, as efficient gradient boosting models, demonstrate good performance in most prediction tasks (Figure 8). However, because they are sensitive to outliers and to multicollinearity among features, and because the biomass of trees in Shanghai’s urban forests contains some outliers, these two models may overfit during training (Zhou et al., 2021; Huang et al., 2013). In contrast, the RF model can better prevent overfitting and address complex nonlinear relationships between variables. These findings are consistent with those of Gao et al. (2022), who reported the superior performance of RF in biomass estimation for subtropical forests. Together, they suggest that the choice of modeling method should account for data type, data quality, and the species-specific characteristics of the study area to achieve optimal predictive performance.
Figure 8. Heatmap of optimal variables after variable selection using SHAP. (a) L. lucidum, (b) C. officinarum, (c) K. paniculata.
It is noteworthy that the RMSE values for camphor (C. officinarum) were substantially higher than those for privet (L. lucidum) and goldenrain tree (K. paniculata). RMSE reflects the absolute deviation between model predictions and observed values, so its magnitude scales with the magnitude of the observed values. The higher RMSE for camphor is therefore directly related to the distribution of observed AGB across species: field measurements indicate that camphor has a wider AGB range (maximum 538 kg, average 197 kg), whereas the average AGB of privet and goldenrain tree is only 55 kg and 101 kg, respectively, both substantially lower than that of camphor.
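The scale dependence of RMSE described above can be checked with a small numerical sketch. It uses toy values (assumed for illustration, not data from the study): when observed and predicted AGB are both multiplied by the same factor, RMSE grows by that factor while R2 is unchanged.

```python
import math

def rmse(observed, predicted):
    """Root mean square error, in the units of the observations."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def r2_score(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Toy AGB values in kg for a small-bodied species (hypothetical).
small_obs = [10.0, 20.0, 30.0]
small_pred = [12.0, 18.0, 33.0]

# The same relative errors for a species ten times heavier (hypothetical).
scale = 10.0
large_obs = [scale * v for v in small_obs]
large_pred = [scale * v for v in small_pred]

base_rmse = rmse(small_obs, small_pred)
scaled_rmse = rmse(large_obs, large_pred)
base_r2 = r2_score(small_obs, small_pred)
scaled_r2 = r2_score(large_obs, large_pred)

print(f"RMSE: {base_rmse:.2f} kg vs {scaled_rmse:.2f} kg")
print(f"R2:   {base_r2:.3f} vs {scaled_r2:.3f}")
```

This is why RMSE values are best compared across species alongside a scale-free measure such as R2.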
The influence of fusing TLS and multispectral data on individual tree AGB inversion accuracy differs among the three tree species, and is closely related to how well each data source captures the characteristics of the species. For L. lucidum, the accuracy of the fusion model (test set R2 = 0.92, RMSE = 1.15 kg) is comparable to that of the TLS-only model (test set R2 = 0.92, RMSE = 1.14 kg). This may be because L. lucidum has a dense canopy and stable leaf biochemical characteristics, with its biomass primarily determined by the structural characteristics of the trunk and branches. TLS can sufficiently capture its key three-dimensional structural information, whereas multispectral data has limited ability to penetrate the dense canopy and capture additional meaningful physiological traits (Khatri-Chhetri et al., 2024). Consequently, the fusion of the two datasets does not lead to a substantial improvement in prediction accuracy. In contrast, the crown of C. officinarum is relatively open, and its leaf biochemical characteristics exhibit some heterogeneity, enabling multispectral data to complement TLS structural information to a certain extent. Consequently, the fusion model (R2 = 0.96, RMSE = 50.67 kg on the test set) outperforms the TLS-only model (R2 = 0.94, RMSE = 57.28 kg on the test set). For K. paniculata, the test set R2 of the fusion model (R2 = 0.91, RMSE = 12.87 kg) was 5% higher than that of the TLS-only model (R2 = 0.86, RMSE = 15.01 kg). This improvement can be attributed to the species’ deciduous nature, characterized by a high proportion of leaf biomass and pronounced seasonal variation, with biomass driven by both structural and biochemical factors. Multispectral data thus provides complementary information on fine-scale canopy structure and leaf traits, resulting in a marked enhancement in prediction accuracy.
In summary, the effectiveness of data fusion in individual tree AGB estimation is not universally applicable; its validity must be analyzed and judged based on the structural morphology and physiological characteristics of different tree species.
4.4 Limitations and future prospects
This study proposes a research framework for estimating individual tree AGB using ground-based LiDAR and UAV remote sensing in synergy, laying a crucial foundation for enhancing the intelligent monitoring of urban forest resources. However, certain limitations exist. First, individual tree segmentation still relies on manual intervention, which not only increases the time cost of data processing but also reduces the automation level of the model. Future efforts should focus on optimizing deep learning-based segmentation algorithms. By leveraging the spatiotemporal synergies of multi-source remote sensing data, we can address the challenge of automated segmentation in complex canopy scenarios, thereby improving the efficiency of technology implementation. Second, the 1-hectare plot size and limited sample size across three tree species restrict the generalizability of results. The study area covers only typical vegetation zones within Shanghai’s urban forest parks, and the heterogeneity of urban forests may reduce the model’s applicability in other regions. Future research should expand plot coverage and enhance spatial representativeness through stratified sampling.
5 Conclusion
This study developed a remote sensing estimation framework for individual tree aboveground biomass in urban forests, integrating TLS data, UAV multispectral imagery, machine learning algorithms, and SHAP interpretability analysis. The framework provides critical support and a technical pathway for the rapid prediction of fine-scale biomass and the intelligent ecological management of urban forests in Shanghai. The fused model performs as well as or better than single-source models, with R2 values of 0.92 (L. lucidum), 0.96 (C. officinarum), and 0.91 (K. paniculata), confirming the advantages of multi-source fusion in urban forest biomass inversion. Species-specific analysis shows that C. officinarum and K. paniculata benefit more from fusion owing to their looser canopy structure and dynamic leaves, while L. lucidum achieves high accuracy with TLS alone, owing to its dense canopy and stable biochemical characteristics. The RF model performs robustly, and SHAP analysis highlights structural features (height, volume, and their three derived indicators) as contributing over 60% of the importance in biomass inversion. These findings provide a basis for refined management of urban tree species. Future research can expand the coverage of sample plots and improve the spatial representativeness of samples through stratified sampling, while focusing on optimizing automated segmentation algorithms based on deep learning to improve segmentation accuracy in complex canopy scenarios.
Data availability statement
The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author.
Author contributions
PL: Visualization, Investigation, Formal Analysis, Software, Methodology, Writing – original draft. YZ: Investigation, Visualization, Formal Analysis, Writing – original draft, Methodology. JR: Writing – original draft, Data curation. GZ: Writing – review and editing, Investigation. JT: Data curation, Writing – review and editing. QW: Writing – review and editing, Data curation, Funding acquisition. KS: Project administration, Writing – review and editing, Supervision, Conceptualization, Funding acquisition.
Funding
The authors declare that financial support was received for the research and/or publication of this article. This study was funded by the Shanghai Science and Technology Innovation Action Plan (Grant numbers 23dz1204500 and 23dz1204501).
Acknowledgements
We would like to thank the Shanghai Advanced Research Institute, Chinese Academy of Sciences, China, for their assistance with data collection and technical support.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The authors declare that no Generative AI was used in the creation of this manuscript.
Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
References
Abdollahnejad, A., and Panagiotidis, D. (2020). Tree species classification and health status assessment for a mixed broadleaf-conifer forest with UAS multispectral imaging. Remote Sens. 12, 3722. doi:10.3390/rs12223722
Aerts, H. J. W. L., Velazquez, E. R., Leijenaar, R. T. H., Parmar, C., Grossmann, P., Carvalho, S., et al. (2014). Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach. Nat. Commun. 5, 4006. doi:10.1038/ncomms5006
Alagulakshmi, R., Ramalakshmi, R., Veerasimman, A., and Palani, G. (2025). A comparative analysis on usage of ANN and SVR algorithms for predicting the mechanical properties of natural fiber-based composites using experimental data. Int. J. Mater. Form. 18, 73. doi:10.1007/s12289-025-01938-z
Ayu, K. R., Sukojo, B. M., and Mukti, D. A. R. (2024). Correlation analysis of vegetation index impact on rice paddy productivity estimation using Landsat-8 and Sentinel-2A images (case study: blitar district). IOP Conf. Ser. Earth Environ. Sci. 1418, 12007. doi:10.1088/1755-1315/1418/1/012007
Baret, F., and Guyot, G. (1991). Potentials and limits of vegetation indices for LAI and APAR assessment. Remote Sens. Environ. 35, 161–173. doi:10.1016/0034-4257(91)90009-U
Brede, B., Terryn, N., Barbier, N., Bartholomeus, M. B., Bartolo, R., Calders, K. C., et al. (2022). Non-destructive estimation of individual tree biomass: allometric models, terrestrial and UAV laser scanning. Remote Sens. Environ. 280, 113180. doi:10.1016/j.rse.2022.113180
Burt, A., Disney, M., and Calders, K. (2019). Extracting individual trees from lidar point clouds using treeseg. Methods Ecol. Evol. 10, 438–445. doi:10.1111/2041-210X.13121
Cao, Y. Q., Zhao, Y. Y., Xu, J. E., Fang, Q., Xuan, J., Huang, L., et al. (2025). UAV-LiDAR-based study on AGB response to stand structure and its estimation in Cunninghamia Lanceolata plantations. Remote Sens. 17, 2842. doi:10.3390/rs17162842
Chen, Y., Li, L., Lu, D., and Li, D. (2019). Exploring bamboo forest aboveground biomass estimation using sentinel-2 data. Remote Sens. 11, 7. doi:10.3390/rs11010007
Chen, Z. Y., Jin, J. Q., Zhang, R., Zhang, T. H., Chen, J. J., Yang, J., et al. (2020). Comparison of different missing-imputation methods for MAIAC (Multiangle implementation of atmospheric correction) AOD in estimating daily PM2.5 levels. Remote Sens. 12, 3008. doi:10.3390/rs12183008
Chen, H., Lundberg, S. M., and Lee, S. I. (2022). Explaining a series of models by propagating shapley values. Nat. Commun. 13, 4512. doi:10.1038/s41467-022-31384-3
Chen, H., Covert, I. C., Lundberg, S. M., and Lee, S. I. (2023). Algorithms to estimate shapley value feature attributions. Nat. Mach. Intell. 5, 590–601. doi:10.1038/s42256-023-00657-x
Cui, Y. J., Jia, W. W., Wang, F., Guo, H. T., and Li, D. D. (2025). Extraction of individual tree factor and tree height model construction of Larix olgensis−Fraxinus mandshurica mixed forest based on TLS data. J. Southwest For. Univ. 45, 142–150. doi:10.11929/j.swfu.202401064
Erb, K. H., Kastner, T., Plutzar, C., Bais, A. L. S., Carvalhais, N., Fetzel, T., et al. (2018). Unexpectedly large impact of forest management and grazing on global vegetation biomass. Nature 553, 73–76. doi:10.1038/nature25138
Forkuor, G., Zoungrana, J. B. B., Dimobe, K., Ouattara, B., Vadrevu, K. P., and Tondoh, J. E. (2020). Above-ground biomass mapping in West African dryland forest using Sentinel-1 and 2 datasets - a case study. Remote Sens. Environ. 236, 111496. doi:10.1016/j.rse.2019.111496
Gao, L., Wang, X. F., Johnson, B. A., Tian, Q. J., Wang, Y., Verrelst, J., et al. (2020). Remote sensing algorithms for estimation of fractional vegetation cover using pure vegetation index values: a review. ISPRS J. Photogramm. 159, 364–377. doi:10.1016/j.isprsjprs.2019.11.018
Gao, L. H., Chai, G. Q., and Zhang, X. L. (2022). Above-ground biomass estimation of plantation with different tree species using airborne LiDAR and hyperspectral data. Remote Sens. 14, 2568. doi:10.3390/rs14112568
Ge, J., Hou, M. J., Liang, T. G., Feng, Q. S., Meng, X. Y., Liu, J., et al. (2022). Spatiotemporal dynamics of grassland aboveground biomass and its driving factors in North China over the past 20 years. Sci. Total Environ. 826, 154226. doi:10.1016/j.scitotenv.2022.154226
Guo, X., Zhang, H., Yuan, T., Zhao, J., and Xue, Z. (2015). Detecting the temporal scaling behavior of the normalized difference vegetation index time series in China using a detrended fluctuation analysis. Remote Sens. 7 (10), 12942–12960. doi:10.3390/rs71012942
Guo, X. Y. (2017). Multi-scale assessments on ecological quality of urban forest in Shanghai. Shanghai, China: East China Normal University. Doctoral thesis.
Guo, Q., Zhang, J., Guo, S. J., Ye, Z. X., Deng, H., Hou, X. L., et al. (2022). Urban tree classification based on object-oriented approach and random forest algorithm using unmanned aerial vehicle (UAV) multispectral imagery. Remote Sens. 14, 3885. doi:10.3390/rs14163885
Gutkin, N., Uwizeyimana, V., Somers, B., Muys, B., and Verbist, B. (2023). Supervised classification of tree cover classes in the complex mosaic landscape of Eastern Rwanda. Remote Sens. 15, 2606. doi:10.3390/rs15102606
Heinrich, V. H. A., Dalagnol, R., Cassol, H. L. G., Rosan, T. M., de Almeida, C. T., Silva Junior, C. H. L., et al. (2021). Large carbon sink potential of secondary forests in the Brazilian Amazon to mitigate climate change. Nat. Commun. 12, 1785. doi:10.1038/s41467-021-22050-1
Huang, J. L., Ju, W. M., Zheng, G., and Kang, T. T. (2013). Estimation of forest aboveground biomass using high spatial resolution remote sensing imagery. Acta Ecol. Sin. 33, 6497–6508. doi:10.5846/stxb201212211841
Huang, H. B., Liu, C. X., Wang, X. Y., Zhou, X. L., and Gong, P. (2019). Integration of multi-resource remotely sensed data and allometric models for forest aboveground biomass estimation in China. Remote Sens. Environ. 221, 225–234. doi:10.1016/j.rse.2018.11.017
Itakura, K., Miyatani, S., and Hosoi, F. (2022). Estimating tree structural parameters via automatic tree segmentation from LiDAR point cloud data. IEEE J-STARS 15, 555–564. doi:10.1109/JSTARS.2021.3135491
Jiang, W., Zhang, L., Zhang, X., Gao, S., Gao, H., Sun, L., et al. (2025). Multi-decision vector fusion model for enhanced mapping of aboveground biomass in subtropical forests integrating Sentinel-1, Sentinel-2, and airborne LiDAR data. Remote Sens. 17 (7), 1285. doi:10.3390/rs17071285
Khatri-Chhetri, P., van Wagtendonk, L., Hendryx, S. M., and Kane, V. R. (2024). Enhancing individual tree mortality mapping: the impact of models, data modalities, and classification taxonomy. Remote Sens. Environ. 300, 113914. doi:10.1016/j.rse.2023.113914
LaRue, E. A., Wagner, F. W., Fei, S. L., Atkins, J. W., Fahey, R. T., Gough, C. M., et al. (2020). Compatibility of aerial and terrestrial LiDAR for quantifying forest structural diversity. Remote Sens. 12, 1407. doi:10.3390/rs12091407
Li, Z. (2022). Extracting spatial effects from machine learning model using local interpretation method: an example of SHAP and XGBoost. Comput. Environ. Urban. 96, 101845. doi:10.1016/j.compenvurbsys.2022.101845
Li, W., Niu, Z., Wang, C., Gao, S., Feng, Q., and Chen, H. Y. (2015). Forest above-ground biomass estimation at plot and tree levels using airborne LiDAR data. J. Remote Sens. 19, 669–679. doi:10.11834/jrs.20154116
Li, X. M., Zhao, D., Chen, J. H., Wu, J., Mu, X., Zheng, Z. J., et al. (2025). Assessing the individual and combined contributions of stand age and tree height for regional-scale aboveground biomass estimation in fast-growing plantations. Remote Sens. 17, 2958. doi:10.3390/rs17172958
Ma, J. N., Zhang, C., Ou, C., Qiu, C., Yang, C. C., Wang, B. B., et al. (2025). Estimation and change analysis of grassland AGB in the China–Mongolia–Russia border area based on multi-source geospatial data. Remote Sens. 17, 2527. doi:10.3390/rs17142527
Nguyen, H. D., Nguyen, Q. H., Dang, D. K., Van, C. P., Truong, Q. H., Pham, S. D., et al. (2024). A novel flood risk management approach based on future climate and land use change scenarios. Sci. Total Environ. 921, 171204. doi:10.1016/j.scitotenv.2024.171204
Nurmasari, Y., and Wijayanto, A. W. (2021). Oil palm plantation detection in Indonesia using Sentinel-2 and Landsat-8 optical satellite imagery (case study: Rokan Hulu regency, Riau province). Int. J. Remote Sens. 18, 1–18. doi:10.30536/j.ijreses.2021.v18.a3537
Obata, A., Yoshida, T., and Hiura, T. (2023). Estimation of stand biomass and species-specific biomass in Japanese northern mixed forests in 1920–1930s: understanding environmental factors affecting carbon sequestration before recent climate change. Ecol. Indic. 54, 110495. doi:10.1016/j.ecolind.2023.110495
Oehmcke, S., Li, L., Trepekli, K., Revenga, J. C., Nord-Larsen, T., Gieseke, F., et al. (2024). Deep point cloud regression for above-ground forest biomass estimation from airborne LiDAR. Remote Sens. Environ. 302, 113968. doi:10.1016/j.rse.2023.113968
Puletti, N., Grotti, M., Ferrara, C., and Chianucci, F. (2020). Lidar-based estimates of aboveground biomass through ground, aerial, and satellite observation: a case study in a mediterranean forest. J. Appl. Remote Sens. 14, 044501. doi:10.1117/1.JRS.14.044501
Puletti, N., Innocenti, S., Guasti, M., Alvites, C., and Ferrara, C. (2025). Improving aboveground biomass estimation in beech forests with 3D tree crown parameters derived from UAV-LS. Remote Sens. 17, 1497. doi:10.3390/rs17091497
Purwanto, A. D., Wikantika, K., Deliar, A., and Darmawan, S. (2023). Decision tree and random forest classification algorithms for mangrove forest mapping in sembilang national park, Indonesia. Remote Sens. 15, 16. doi:10.3390/rs15010016
Shang, K. K., Zheng, S. J., and Zhang, Q. F. (2013). Characteristic of plant community structure in a 1 hm2 plot of the Haiwan National forest park of Shanghai and the significance of its dynamics monitoring. J. Ecol. Rural. Environ. 29, 316–321. doi:10.3969/j.issn.1673-4831.2013.03.008
Singh, A., Kushwaha, S. K. P., Nandy, S., Padalia, H., Ghosh, S., Srivastava, A., et al. (2023). Aboveground forest biomass estimation by the integration of TLS and ALOS PALSAR data using machine learning. Remote Sens. 15, 1143. doi:10.3390/rs15041143
Soh, L. K., and Tsatsoulis, C. (1999). Texture analysis of SAR sea ice imagery using gray level co-occurrence matrices. IEEE T. Geosci. Remote. 37, 780–795. doi:10.1109/36.752194
Song, K. X., Jiang, F. G., Hu, Z. D., Lv, Y. B., Long, Y., Deng, M. L., et al. (2023). Remote sensing inversion of above-ground biomass of grassland in the Tibet Autonomous Region. Acta Ecol. Sin. 43, 5600–5613. doi:10.5846/stxb202206061593
Sun, H., Wang, G. X., Li, H., Li, J. P., Zhang, H. Q., and Ju, H. B. (2015). Retrieval and accuracy assessment of tree and stand parameters for Chinese fir plantation using terrestrial laser scanning. IEEE Geosci. Remote Sens. Lett. 12, 1993–1997. doi:10.1109/LGRS.2015.2443553
Verkerk, P. J., Fitzgerald, J. B., Datta, P., Dees, M., Hengeveld, G. M., Lindner, M., et al. (2019). Spatial distribution of the potential forest biomass availability in Europe. For. Ecosyst. 6, 5. doi:10.1186/s40663-019-0163-5
Wang, Z., Du, B. M., Han, Y. J., Cui, X., Huang, D., Xu, C. Y., et al. (2014). Carbon storage of Ligustrum lucidum plantations in Shanghai out-loop forest belt. Chin. J. Ecol. 33, 910–914. doi:10.13292/j.1000-4890.2014.0089
Wang, F., Sun, Y. M., Jia, W. W., Zhu, W. C., Li, D. D., Zhang, X. Y., et al. (2023). Development of estimation models for individual tree aboveground biomass based on TLS-derived parameters. Forests 14, 351. doi:10.3390/f14020351
Whelan, A. W., Cannon, J. B., Bigelow, S. W., Rutledge, B. T., and Sánchez Meador, A. J. (2023). Improving generalized models of forest structure in complex forest types using area- and voxel-based approaches from lidar. Remote Sens. Environ. 284, 113362. doi:10.1016/j.rse.2022.113362
Wu, Z. J., Yao, F. M., Zhang, J. H., and Liu, H. Y. (2024). Estimating forest aboveground biomass using a combination of geographical random forest and empirical bayesian kriging models. Remote Sens. 16, 1859. doi:10.3390/rs16111859
Wu, Y., Chen, Y., Tian, C., Yun, T., and Li, M. (2025). Estimation of subtropical forest aboveground biomass using active and passive sentinel data with canopy height. Remote Sens. 17 (14), 2509. doi:10.3390/rs17142509
Yang, W., Yang, X. B., Zhang, L., Fan, X. J., Ye, Q. L., and Fu, L. Y. (2023). Individual tree segmentation and tree-counting using supervised clustering. Comput. Electron. Agric. 205, 107629. doi:10.1016/J.COMPAG.2023.107629
Ye, N., Mason, E., Xu, C., and Morgenroth, J. (2025). Estimating individual tree DBH and biomass of durable Eucalyptus using UAV LiDAR. Ecol. Inf. 89, 103169. doi:10.1016/j.ecoinf.2025.103169
Zhang, Y. Z., and Liang, S. L. (2020). Fusion of multiple gridded biomass datasets for generating a global forest aboveground biomass map. Remote Sens. 12 (16), 2559. doi:10.3390/rs12162559
Zhang, X. J., Leng, H. B., Zhao, G. Q., Jing, J., Song, K., and Da, L. J. (2018). Allometric models for estimating aboveground biomass for four common greening tree species in Shanghai city, China. J. Nanjing For. Univ. 42, 141–146. doi:10.3969/j.issn.1000-2006.201704025
Zhang, L. J., Zhang, X. X., Shao, Z. F., Jiang, W. H., and Gao, H. M. (2023). Integrating Sentinel-1 and 2 with LiDAR data to estimate aboveground biomass of subtropical forests in northeast Guangdong, China. Int. J. Digit. Earth 16, 158–182. doi:10.1080/17538947.2023.2165180
Zhao, X., Guo, Q., Su, Y., and Xue, B. (2016). Improved progressive TIN densification filtering algorithm for airborne LiDAR data in forested areas. ISPRS J. Photogramm. Remote Sens. 117, 79–91. doi:10.1016/j.isprsjprs.2016.03.016
Zhen, Z., Li, X., Ma, Y., Zhao, Y., and Wang, X. (2025). A hybrid method for forest aboveground biomass estimation: fusion of individual tree- and area-based approaches over northeast China. GISci. Remote Sens. 62 (1), 2497629. doi:10.1080/15481603.2025.2497629
Keywords: terrestrial laser scanning, unmanned aerial vehicle imagery, aboveground biomass, machine learning, Shapley additive explanations
Citation: Luo P, Zhang Y, Ruan J, Zhang G, Tan J, Wang Q and Shang K (2025) Estimation of individual tree biomass for three tree species using LiDAR and multispectral data in megacity Shanghai. Front. Remote Sens. 6:1697927. doi: 10.3389/frsen.2025.1697927
Received: 02 September 2025; Accepted: 20 November 2025;
Published: 03 December 2025.
Edited by:
Milind B. Ratnaparkhe, ICAR Indian Institute of Soybean Research, India
Reviewed by:
Licheng Zhao, Chinese Academy of Sciences (CAS), China
Wenhao Jiang, Beijing Normal University, China
Copyright © 2025 Luo, Zhang, Ruan, Zhang, Tan, Wang and Shang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Kankan Shang, shangkankan@163.com
Yanwen Zhang2,3