Dynamic optimization of stand structure in Pinus yunnanensis secondary forests based on deep reinforcement learning and structural prediction

Zhao, Jian; Wang, Jianming; Yin, Jiting; Chen, Yuling; Wu, Baoguo

doi:10.3389/fpls.2025.1610571

ORIGINAL RESEARCH article

Front. Plant Sci., 15 October 2025

Sec. Functional Plant Ecology

Volume 16 - 2025 | https://doi.org/10.3389/fpls.2025.1610571

Jian Zhao¹

Jianming Wang¹*

Jiting Yin²

Yuling Chen³

Baoguo Wu⁴

¹School of Mathematics and Computer Science, Dali University, Dali, China
²Dali Forestry and Grassland Science Research Institute, Dali, China
³Institute of Remote Sensing and Geographic Information System, School of Earth and Space Sciences, Peking University, Beijing, China
⁴School of Information Science and Technology, Beijing Forestry University, Beijing, China

Introduction: The rational structure of forest stands plays a crucial role in maintaining ecosystem functions, enhancing community stability, and ensuring sustainable management. Although progress has been made in stand structure optimization, most existing studies focus on static improvements and fail to adequately capture the dynamic nature of stand development. In addition, commonly used heuristic and traditional methods often suffer from limitations in computational efficiency and generalization ability.

Methods: To address these challenges, this study explores the potential and advantages of multi-agent deep reinforcement learning in forest management, offering innovative insights and methods for achieving sustainable forest ecosystem management. Using the secondary forests of Pinus yunnanensis in southwest China as the research subject, we constructed an objective function and constraints based on spatial and non-spatial structure indexes. Selective harvesting and replanting were employed as optimization measures, and experiments were conducted on five circular plots to compare the performance of multi-agent deep reinforcement learning with that of multi-agent reinforcement learning. To account for the dynamic characteristics of stand structure, we further integrated structure prediction with multi-agent deep reinforcement learning for dynamic optimization across the five plots.

Results: Multi-agent deep reinforcement learning consistently outperformed multi-agent reinforcement learning across all plots. Starting from the initial objective function values of the five plots (0.3501, 0.3799, 0.3982, 0.3344, 0.4294), the optimized values obtained through multi-agent deep reinforcement learning (0.5378, 0.5861, 0.5860, 0.5130, 0.6034) were higher in every plot than the maximum objective function values achieved by multi-agent reinforcement learning (0.5302, 0.5369, 0.5766, 0.5014, 0.5906). Furthermore, the dynamic optimization results incorporating structure prediction show that all plots progressively approached an ideal stand condition over multiple optimization cycles (0.5718, 0.6101, 0.6455, 0.5863, 0.6210), leading to a more balanced stand structure and improved long-term stability.

Discussion: This study proposes a novel stand structure optimization method that integrates multi-agent deep reinforcement learning with structure prediction, providing theoretical support and practical guidance for the sustainable management of Pinus yunnanensis secondary forests.

1 Introduction

Secondary forests often face challenges such as unstable stand structure, reduced biodiversity, and susceptibility to natural disturbances, including insect infestations, diseases, and wildfires (Lei et al., 2008; Ding and Zang, 2021; Zaizhi, 2001). To enhance the stability and sustainability of secondary forests, stand structure optimization has become a key technical approach in forest management and planning, providing essential support for their scientific management.

Selective harvesting is a crucial measure for optimizing stand structure and has received widespread attention (Dong et al., 2022; Dongsheng et al., 2020; Chi et al., 2019; Dong et al., 2020). By removing trees with limited growth potential and weak competitiveness, the growth environment and resource allocation of the remaining trees can be improved, thereby optimizing stand structure.

Common stand structure optimization algorithms include heuristic methods such as particle swarm optimization (PSO) (Wu et al., 2022), Monte Carlo algorithm (Haight and Travis, 1997; Boston and Bettinger, 1999), and genetic algorithms (GA) (Jianming et al., 2017; Okasha and Frangopol, 2009; Fotakis et al., 2012). PSO provides certain advantages in global search but is prone to local optima in complex problems; Monte Carlo methods explore the solution space through random sampling but often suffer from low computational efficiency; GA performs well in handling nonlinear problems but typically requires many iterations to converge. Overall, although these methods can address stand structure optimization tasks, they generally face limitations such as high computational cost, susceptibility to local optima, and insufficient solution efficiency. In our previous study, we applied deep reinforcement learning to improve the efficiency and accuracy of multi-objective stand structure optimization. By modeling tree-felling decisions as agent actions and incorporating neural networks with experience replay for stable training, this approach achieved superior optimization results compared with traditional heuristic algorithms and conventional reinforcement learning methods across multiple plots of Pinus yunnanensis secondary forests (Zhao et al., 2024).

Selective harvesting alone can only reduce competition and adjust stand density; it cannot restore species diversity or fill the spatial gaps created by harvesting, and is therefore insufficient to achieve comprehensive optimization of stand structure. On this basis, replanting measures, namely the planting of native tree seedlings in appropriate locations, should be implemented to achieve overall optimization of stand structure. Common strategies for selecting replanting locations include the Voronoi diagram method (Wang et al., 2019), the maximum Delaunay triangulation area method (Chunyan and Jiping, 2017), and the Kriging interpolation method (Jian et al., 2018). In general, these methods identify relatively sparse areas within the stand as potential replanting sites using different algorithms. However, they have certain limitations: on the one hand, the replanting locations are relatively fixed and lack flexibility, which may lead to suboptimal replanting outcomes; on the other hand, these methods often overlook the competitive interactions between the replanted trees and neighboring trees, potentially increasing resource competition within the stand and affecting the optimization of stand structure.

Building on this foundation, our research team applied multi-agent reinforcement learning to integrate selective harvesting and replanting, using multiple agents for collaborative optimization. Compared to single selective harvesting or replanting, multi-agent reinforcement learning offers advantages such as improved harvesting effectiveness and more flexible replanting locations, providing high adaptability and variability (Xuan et al., 2024, 2023). However, for such complex optimization problems, the trial-and-error cost in reinforcement learning increases, leading to unstable training and poor generalization capability. In contrast, multi-agent deep reinforcement learning not only retains the collaborative optimization advantages of multi-agent reinforcement learning but also exhibits superior computational efficiency. It achieves higher solution stability and efficiency when handling complex problems, along with enhanced generalization capability (Waschneck et al., 2018; Ning et al., 2023; Gronauer and Diepold, 2022; Shen et al., 2022).

Despite significant progress in current stand structure optimization research (Olsthoorn et al., 1999; Chen et al., 2023; Zhang et al., 2024), existing studies primarily focus on optimizing the present stand condition while overlooking the dynamic changes in stand structure. Research on optimizing dynamic stand structures remains relatively scarce (Na, 2019). As climate change, ecological shifts, and increasing complexities in forest management demand more adaptive strategies, dynamic stand structure optimization—optimization from a long-term perspective—will become increasingly important. Therefore, integrating scientific prediction with effective optimization methods to enhance the sustainability and adaptability of future stand management has emerged as a key challenge in contemporary forestry research.

Structure prediction is another important field in ecology and forest management. By analyzing existing stand data, it forecasts future stand growth trends and structural evolution.

Extensive research has been conducted on predicting stand variables such as DBH (Gyawali et al., 2015; Bohora and Cao, 2014), tree height (Lee et al., 2024; Siipilehto et al., 2023), crown width (Raptis et al., 2018; Sánchez-González et al., 2007), and crown length (Mengying et al., 2021; Sattler and LeMay, 2011). General growth models (Jiazheng et al., 2021) offer biologically interpretable insights, mixed-effects models (Chang and Fan, 2024) capture both population trends and individual variation, and machine learning methods such as random forests (Xiaonan et al., 2024) excel in nonlinear modeling and predictive accuracy. These approaches have all achieved promising results. Particularly in forest management and resource planning, these models provide critical scientific support for stand management. However, existing studies on structure prediction mainly focus on the interactions and trend predictions of multiple tree attributes (Ling-bo and Zhao-gang, 2011; Xi et al., 2015), with relatively little attention given to their overall impact on stand structural evolution (Jiping et al., 2020). Integrating prediction with optimization not only enhances our understanding of the dynamic changes in forest ecosystems but also provides scientific guidance for dynamic stand structure optimization, ultimately improving the long-term effectiveness of forest management.

In summary, although existing studies have made significant progress in stand structure optimization, they primarily focus on static improvements and fail to adequately address the dynamic nature of stand development over time. Moreover, heuristic algorithms and traditional reinforcement learning methods suffer from limitations in computational efficiency and generalization ability, restricting their applicability in long-term, complex ecosystem management. To bridge these gaps, this study proposes an innovative approach that integrates multi-agent deep reinforcement learning with stand structure prediction, focusing on the secondary forests of Pinus yunnanensis. By incorporating dynamic prediction into selective harvesting and replanting measures, we aim to achieve dynamic optimization of stand structure, not only enhancing the current structural condition but also improving long-term stability and sustainability.

2 Materials and methods

2.1 Study areas

The study area is located in the Cangshan region of the Dali Bai Autonomous Prefecture, Yunnan Province, southwestern China (Figure 1), spanning 25°34′–26°00′N and 99°55′–100°12′E, with a total area of approximately 293 km² and an elevation range of 1966–4122 m. Pinus yunnanensis is a typical pioneer and dominant species in this region, playing a key role in water and soil conservation and the maintenance of biodiversity. However, due to historical overexploitation, secondary Pinus yunnanensis forests in this area generally exhibit simple stand structures and poor stability, making them both a focus and a challenge for sustainable forest management. Therefore, conducting structural optimization studies on this typical forest type is of theoretical and practical importance for achieving precise improvement of regional forest ecosystems. The region has a plateau monsoon subtropical climate characterized by mild and stable weather, ample sunlight, small annual temperature variations, and large diurnal temperature fluctuations, with an annual average temperature of 16.1°C. The prevailing wind is the southwest monsoon (Shuai et al., 2024). Annual precipitation is abundant, reaching 861.1 mm, with distinct dry and wet seasons. Rainfall is concentrated from May to October, accounting for 83% of the total annual precipitation. The predominant soil type in the area is Hyperdystric Clayic Ferralsol (Ferric). Pinus yunnanensis is the primary tree species, and the associated tree species in the canopy layer include Pinus armandii Franch., Betula alnoides Buch.-Ham. ex D. Don, Quercus acutissima Carruth., and Quercus variabilis Blume. The understory shrub layer includes species such as Vaccinium bracteatum Thunb., Rhododendron microphyton Franch., Gaultheria griffithiana Wight, Eurya nitida Korthals, and Ternstroemia gymnanthera (Wight & Arn.) Bedd.

2.2 Study site and data collection

When establishing standard sample plots, circular plots offer advantages over traditional square plots, such as easier setup and positioning in complex terrain and a smaller edge effect for the same area (Packalen et al., 2023). Square plots are commonly set at an initial size of 20 m × 20 m, which corresponds to a circular plot of equal area (400 m²) with a radius of approximately 11.28 m. To study the structural characteristics and optimization methods of secondary Pinus yunnanensis forests at different scales, this research established circular plots of varying radii based on topographic conditions and plot accessibility, in accordance with predefined rules for standard plot radius division.

Based on the terrain conditions and stand characteristics, 11 fixed circular standard plots with radii ranging from 12 to 35 m were established at elevations between 2100 and 2400 m on Cangshan Mountain (Packalen et al., 2023). The geographical coordinates, elevation, slope, aspect, and plot radius of each plot were measured and recorded. For each circular standard plot, all living trees with a diameter at breast height (DBH) of at least 5 cm were individually measured. The species, relative coordinates, DBH, tree height, crown width, and other basic tree factors were recorded for each tree. The relative coordinates of each tree base were measured using a total station (GTS-2002). Plots P1–P5, which had better site conditions, were selected as the experimental plots for simulation optimization; their basic information is provided in Table 1.

Table 1. Basic information of the sample plots.

2.3 Determination of spatial structure units and edge correction

This study employed the Voronoi diagram method to determine spatial relationships among trees (Liu et al., 2023). Centered on a reference tree, the Voronoi diagram method accurately captures tree adjacency relationships while effectively reflecting their horizontal distribution pattern. During data processing, Voronoi diagrams were generated using R 4.2.0, with each polygon representing a spatial structural unit formed by a tree and its neighboring trees. To minimize errors in calculating spatial structure indexes caused by edge trees being fragmented at the plot boundary, this study adopted the buffer zone method. The plot boundary was contracted inward by 2 m to create a buffer zone (Von Gadow et al., 2003). When computing spatial structure indexes, trees within the buffer zone were only considered as neighboring trees for constructing spatial structural units and were not used as reference trees.
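The buffer-zone rule described above can be sketched in a few lines. This is a minimal illustration, assuming a circular plot whose relative coordinates are centred on the plot centre; the function name `split_reference_and_buffer`, the tuple layout, and the inclusive boundary are hypothetical choices, and the Voronoi neighbour search itself (performed in R in this study) is not reproduced.

```python
import math


def split_reference_and_buffer(trees, plot_radius, buffer_width=2.0):
    """Separate trees usable as reference trees from buffer-only trees.

    trees: list of (tree_id, x, y), coordinates in metres relative to
           the plot centre (an assumption of this sketch).
    plot_radius: radius of the circular plot in metres.
    buffer_width: inward contraction of the boundary (2 m in the text).
    """
    core_radius = plot_radius - buffer_width
    reference, buffer_only = [], []
    for tree_id, x, y in trees:
        # Trees in the outer ring may still serve as neighbours,
        # but they are never used as reference trees.
        if math.hypot(x, y) <= core_radius:
            reference.append(tree_id)
        else:
            buffer_only.append(tree_id)
    return reference, buffer_only


# Hypothetical 12 m plot: tree 3 lies inside the 2 m buffer ring.
ref_ids, buf_ids = split_reference_and_buffer(
    [(1, 0.0, 0.0), (2, 9.5, 0.0), (3, 0.0, 11.0)], plot_radius=12.0
)
```

Spatial structure indexes would then be averaged over `ref_ids` only, while trees in `buf_ids` remain available as neighbours.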

2.4 Stand structure indexes

Quantifying stand structure is a fundamental aspect of stand structure optimization. In this study, spatial structure was set as the primary objective, while non-spatial structure served as a constraint. The selected non-spatial structure indexes included tree diameter classes, number of species, canopy density, harvesting intensity, and planting density. The selected spatial structure indexes included the uniform angle index, complete mingling, crown competition index, stratification index, and neighborhood comparison. Among these, the uniform angle index describes the horizontal distribution pattern of trees, complete mingling represents the degree of tree species segregation, the crown competition index quantifies competition pressure among trees, the stratification index characterizes the vertical distribution pattern, and neighborhood comparison measures the degree of size differentiation among trees.

2.4.1 Non-spatial structure indexes

2.4.1.1 Tree diameter classes

Trees are classified into different categories based on their DBH, with a greater number of diameter classes indicating better stand growth. In the optimization process, it is required that the diversity of tree diameter classes remains consistent before and after optimization. In this study, tree diameter classification starts from a DBH of 6 cm, with a 2 cm interval for each diameter class (Equation 1).

\begin{array}{l} D = D_{0} & (1) \end{array}

Where D₀ represents the number of diameter classes of trees within the stand before harvesting, and D represents the number of diameter classes of trees within the stand after harvesting.

2.4.1.2 Number of species

During the optimization process, tree species diversity must be preserved, and no species should be artificially eliminated from the stand. It is required that the tree species diversity remains consistent before and after optimization to ensure that no species disappear (Equation 2).

\begin{array}{l} T = T_{0} & (2) \end{array}

Where T₀ denotes the initial number of tree species, while T indicates the number of tree species after harvesting.

2.4.1.3 Canopy density

A healthy forest requires the canopy to form a continuous cover. Generally, a canopy density (Cd) of no less than 0.7 is considered indicative of continuous forest cover (Equation 3).

\begin{array}{l} C d \geq 0.7 & (3) \end{array}

2.4.1.4 Harvesting intensity

Harvesting intensity determines whether the stand’s growth condition remains favorable after optimization. According to harvesting requirements, the amount of selective harvesting should be less than the growth increment. Research indicates that the harvesting intensity of Pinus yunnanensis secondary forests should be kept within 35% (Su et al., 2010; Han et al., 2011) (Equation 4).

\begin{array}{l} N \geq N_{0} (1 - 35\%) & (4) \end{array}

Where N₀ represents the total number of trees before harvesting, while N represents the total number of trees after harvesting.

2.4.1.5 Planting density

Planting density is a key factor influencing the effectiveness of replanting. Previous studies have shown that the optimal planting density for Pinus yunnanensis ranges from 1667 to 3333 trees per hectare (Zhang et al., 2023). After replanting optimization, the stand density (PD) should therefore fall within this range (Equation 5).

\begin{array}{l} 1667 \leq P D \leq 3333 & (5) \end{array}
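Taken together, Equations 1–5 act as a feasibility filter on candidate stands. The following sketch shows one way such a check could be written; the function name and argument names are hypothetical, and evaluating all five constraints in a single call is a simplification, since the harvesting-intensity constraint applies after felling and the density constraint after replanting.

```python
def satisfies_constraints(d, d0, t, t0, canopy_density, n, n0, stand_density):
    """Return True if a candidate stand meets Equations 1-5.

    d, d0: number of diameter classes after / before optimization
    t, t0: number of tree species after / before optimization
    canopy_density: canopy density Cd (0-1)
    n, n0: number of trees after / before harvesting
    stand_density: stand density PD in trees per hectare
    """
    return (
        d == d0                            # Eq. 1: diameter classes preserved
        and t == t0                        # Eq. 2: no species lost
        and canopy_density >= 0.7          # Eq. 3: continuous canopy cover
        and 100 * n >= 65 * n0             # Eq. 4: N >= N0 (1 - 35%), integer form
        and 1667 <= stand_density <= 3333  # Eq. 5: density window
    )


# Hypothetical stand: 100 trees before harvesting, 70 retained.
feasible = satisfies_constraints(10, 10, 5, 5, 0.75, 70, 100, 2000)
```

Equation 4 is written with integers (`100 * n >= 65 * n0`) so that a stand sitting exactly on the 35% limit is not rejected by floating-point rounding.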

2.4.2 Spatial structure indexes

2.4.2.1 Neighborhood comparison (U)

Neighborhood comparison (Aguirre et al., 2003) is used to describe the degree of size differentiation and competition among trees. It is the proportion of neighboring trees whose DBH is larger than that of the reference tree. The expression is given as (Equation 6):

\begin{array}{l} U_{i} = \frac{1}{n} \sum_{j = 1}^{n} k_{i j} & (6) \end{array}

U_i represents the neighborhood comparison for reference tree i, and n is the number of neighboring trees. If the diameter at breast height of neighboring tree j is greater than that of reference tree i, then k_ij = 1; otherwise, k_ij = 0. A smaller U_i indicates a greater dominance of the reference tree. The value of U_i can fall into five intervals: 0, (0, 0.25], (0.25, 0.5], (0.5, 0.75], and (0.75, 1], corresponding to the reference tree being in dominant, sub-dominant, intermediate, disadvantaged, and absolutely disadvantaged status within the stand, respectively.
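Equation 6 translates directly into code. The sketch below is illustrative only; the function name `neighborhood_comparison` and the DBH values are hypothetical.

```python
def neighborhood_comparison(ref_dbh, neighbor_dbhs):
    """U_i: share of neighbours whose DBH exceeds that of the reference tree."""
    if not neighbor_dbhs:
        raise ValueError("at least one neighbouring tree is required")
    # k_ij = 1 when neighbour j is thicker than the reference tree, else 0.
    larger = sum(1 for dbh in neighbor_dbhs if dbh > ref_dbh)
    return larger / len(neighbor_dbhs)


# Hypothetical unit: two of the four neighbours are thicker than the
# 20 cm reference tree, so U_i = 0.5 (intermediate status).
u_i = neighborhood_comparison(20.0, [25.0, 18.0, 30.0, 15.0])
```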

2.4.2.2 Crown competition index

The crown competition index (Jianming, 2017) is a method used to describe the degree of competition among trees by calculating the crown overlap area based on tree characteristics such as crown width and crown length, thereby reflecting the competitive pressure during tree growth. The expression is given as (Equations 7–10):

\begin{array}{l} C I_{i} = \frac{1}{Z_{i}} \times \sum_{j = 1}^{n} A O_{i j} \times \frac{L_{j}}{L_{i}} & (7) \end{array}

CI_i represents the crown competition index for reference tree i, and Z_i represents the crown projection area of reference tree i. L_j = H_j × CW_j × CL_j (height × crown width × crown length of competing tree j), and L_i = H_i × CW_i × CL_i (height × crown width × crown length of reference tree i). AO_ij represents the crown overlap area between reference tree i and competitor tree j. If there is no overlap, AO_ij = 1. When there is overlap,

\begin{array}{l} S_{0} = \frac{CW_{i}^{2}}{2} \sum_{j = 1}^{n} \arccos \left( \frac{q_{j}^{2}}{2 CW_{i}^{2}} - 1 \right) - \frac{1}{4} q_{j} \sqrt{4 CW_{i}^{2} - q_{j}^{2}} & (8) \end{array}

\begin{array}{l} S_{1} = \frac{1}{2} \sum_{j = 1}^{n} \left[ CW_{j}^{2} \arccos \left( 1 - \frac{4 CW_{i}^{2} - q_{j}^{2}}{2 CW_{j}^{2}} \right) - \frac{\sqrt{4 CW_{i}^{2} - q_{j}^{2}}}{2} \right] \times \sqrt{4 CW_{i}^{2} - (4 CW_{i}^{2} - q_{j}^{2})} & (9) \end{array}

\begin{array}{l} A O_{i j} = S_{0} + S_{1} & (10) \end{array}

S₀ represents the total area of reference tree i shaded by the n competitor trees, and S₁ represents the total area of the n competitor trees shaded by reference tree i. Here $q_{j} = \frac{L_{i j}^{2} - (C W_{j}^{2} - C W_{i}^{2})}{L_{i j}}$, where L_ij represents the distance between competitor tree j and reference tree i, CW_i represents the crown width of reference tree i, CW_j represents the crown width of competitor tree j, and n represents the number of competitor trees.

2.4.2.3 Stratification index

The stratification index (Zhao et al., 2024; Zhou et al., 2022) reflects the vertical distribution pattern of trees and the diversity of stand structure. It is an extension of the storey index, incorporating the influence of terrain on forest stratification. The expression is given as (Equations 11–13):

\begin{array}{l} S_{i} = \frac{z_{i}}{3} \times \frac{1}{n} \sum_{j = 1}^{n} \left( 1 - \frac{| FL_{i} - FL_{j} |}{\max (| FL_{i} - FL_{j} |, 1)} \right) & (11) \end{array}

\begin{array}{l} FL_{i} = \left\{ \begin{array}{ll} -1, & H_{i} \leq \frac{1}{3} H_{d} \\ 0, & \frac{1}{3} H_{d} \leq H_{i} \leq \frac{2}{3} H_{d} \\ 1, & H_{i} \geq \frac{2}{3} H_{d} \end{array} \right. & (12) \end{array}

\begin{array}{l} H_{d} = \frac{1}{\lfloor 100 A \rfloor} \sum_{i = 1}^{\lfloor 100 A \rfloor} (H_{\mathrm{sort}(i)} + E_{\mathrm{sort}(i)}) & (13) \end{array}

S_i represents the stratification index for reference tree i, and z_i denotes the number of layers within the spatial structure unit to which reference tree i belongs. FL_i indicates the vertical layer class of reference tree i. H_i represents the height of reference tree i, while H_d denotes the dominant height. A denotes the plot area in hectares, H_sort(i) is the height of the i-th tree among the tallest ⌊100A⌋ trees (that is, the 100 tallest trees per hectare), and E_sort(i) indicates the relative elevation of the i-th tree among these ⌊100A⌋ trees. The closer the stratification index is to 1, the more complex the vertical stratification of the stand.

2.4.2.4 Complete mingling

Complete mingling (Sheng et al., 2023) introduces the Simpson index into the traditional mingling index to enhance the differentiation of tree species diversity. It is used to describe the degree of tree species segregation while also accounting for species diversity. The expression is given as (Equation 14):

\begin{array}{l} M c_{i} = \frac{M_{i}}{2} [1 - \frac{1}{{(n + 1)}^{2}} \sum_{j = 1}^{s_{i}} n_{j}^{2} + \frac{n_{i}}{n}] & (14) \end{array}

Mc_i represents the complete mingling of reference tree i, n is the number of neighboring trees, n_i is the number of different species among the neighboring trees, n_j is the number of trees of the j-th species among the neighboring trees, and s_i is the number of species within the spatial structure unit to which reference tree i belongs. M_i represents the mingling degree of reference tree i, $M_{i} = \frac{1}{n} \sum_{j = 1}^{n} v_{i j}$. When reference tree i and neighboring tree j are of the same species, v_ij = 0; otherwise, v_ij = 1. The value of Mc_i can fall into five intervals: 0, (0, 0.25], (0.25, 0.5], (0.5, 0.75], and (0.75, 1], corresponding to zero mixing, low mixing, moderate mixing, high mixing, and complete mixing, respectively.

2.4.2.5 Uniform angle index (W)

The uniform angle index (Zhang et al., 2018) is used to describe the spatial distribution pattern of trees. It is defined as the proportion of α angles (the smaller angles between adjacent neighboring trees) that are less than the standard angle $\alpha_{0}$ ($\alpha_{0} = \frac{360^{\circ}}{n + 1}$) out of the n angles formed. Its expression is (Equation 15):

\begin{array}{l} W_{i} = \frac{1}{n} \sum_{j = 1}^{n} z_{i j} & (15) \end{array}

W_i represents the uniform angle index for reference tree i. When the j-th α angle is smaller than the standard angle α₀, z_ij = 1; otherwise, z_ij = 0. The value of W_i can fall into five intervals: 0, (0, 0.25], (0.25, 0.5], (0.5, 0.75], and (0.75, 1], corresponding to absolutely uniform, uniform, random, non-uniform, and completely non-uniform distributions, respectively. The ideal range for the mean uniform angle index in a stand is [0.475, 0.517].
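A small sketch of Equation 15 may help make the angle counting concrete. It assumes planar (x, y) coordinates, takes each α as the smaller of the two angles between neighbours that are adjacent in bearing around the reference tree, and uses the hypothetical function name `uniform_angle_index`.

```python
import math


def uniform_angle_index(ref_xy, neighbor_xys):
    """W_i: share of alpha angles smaller than the standard angle 360/(n+1)."""
    n = len(neighbor_xys)
    if n < 2:
        raise ValueError("at least two neighbouring trees are required")
    rx, ry = ref_xy
    # Bearing of every neighbour as seen from the reference tree, in [0, 360).
    bearings = sorted(
        math.degrees(math.atan2(y - ry, x - rx)) % 360.0
        for x, y in neighbor_xys
    )
    standard_angle = 360.0 / (n + 1)
    count = 0
    for k in range(n):
        # Angle between neighbours adjacent in bearing (wrapping around).
        gap = (bearings[(k + 1) % n] - bearings[k]) % 360.0
        alpha = min(gap, 360.0 - gap)  # keep the smaller of the two angles
        if alpha < standard_angle:
            count += 1
    return count / n


# Four neighbours at the compass points: every alpha is 90 degrees,
# which exceeds the 72 degree standard angle, so the pattern is uniform.
w_regular = uniform_angle_index((0, 0), [(1, 0), (0, 1), (-1, 0), (0, -1)])
```

With four neighbours the standard angle is 72°, so a regular cross gives W_i = 0, whereas four neighbours packed into a narrow sector give W_i = 1.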

2.5 Selective harvesting strategy

Random selection (Tang et al., 2004), tree homogeneity index (Yitong, 2019), and spatial competition (Zhang et al., 2019) are common methods for determining felling decisions. In the random selection method, trees are randomly selected as candidates for felling from the initial stand. The tree homogeneity index-based method calculates a comprehensive index L_i for each tree using spatial structure parameters and ranks the trees in ascending order to determine the felling candidates. The spatial competition-based method evaluates trees based on horizontal spatial patterns and competition pressure, selecting trees with a greater difference between the uniform angle index and 0.496, a higher neighborhood comparison value, and a larger crown competition index as felling candidates. In our previous research, we conducted an experimental comparison of these three methods and found that random selection was best suited for integration with the deep reinforcement learning algorithm (Zhao et al., 2024). Therefore, random selection is used in this study for the felling optimization process.
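As a rough illustration of random candidate selection, the sketch below draws felling candidates uniformly at random and caps their number at the 35% harvesting-intensity limit of Equation 4. The function name, the `n_candidates` argument, and the capping rule are assumptions made for this example; the study's actual sampling procedure is not specified in this section.

```python
import random


def random_felling_candidates(tree_ids, n_candidates, rng=None):
    """Draw felling candidates at random, capped at 35% of the stand."""
    rng = rng or random.Random()
    # Integer arithmetic avoids floating-point surprises at the 35% limit.
    limit = len(tree_ids) * 35 // 100
    k = min(n_candidates, limit)
    return rng.sample(tree_ids, k)


# Hypothetical stand of 100 trees; a request for 50 candidates is capped at 35.
stand = list(range(100))
candidates = random_felling_candidates(stand, 50, rng=random.Random(0))
```

Passing a seeded `random.Random` keeps a simulation run reproducible.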

2.6 Replanting strategy

2.6.1 Planting location

The maximum Delaunay triangulation area method is a planting location determination approach based on Delaunay triangulation. In a Delaunay triangulation network, edge lengths represent the distances between neighboring trees, while nodes correspond to individual trees. Planting locations are determined by identifying the incenter of the largest triangle formed by the nodes, thereby reflecting both the presence of canopy gaps and the overall stand distribution pattern. To account for the influence of surrounding neighboring trees on the growth of replanted trees, the replanting foreground index (RFI) (Xuan et al., 2024) is incorporated to identify planting locations that have a greater impact on stand structure optimization. This method comprehensively considers various factors affecting the growth of replanted trees, thereby improving the accuracy and effectiveness of planting location determination and enhancing the efficiency of stand structure adjustment and optimization.
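The geometric step of this method, locating the incenter of the largest triangle, can be sketched as follows. The sketch assumes the Delaunay triangles have already been computed by some triangulation routine and are passed in as coordinate triples; the function names are hypothetical.

```python
import math


def triangle_area(a, b, c):
    """Area of a triangle from its vertices (shoelace formula)."""
    return abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2.0


def incenter(a, b, c):
    """Incenter: vertices weighted by the length of the opposite side."""
    la = math.dist(b, c)  # side opposite vertex a
    lb = math.dist(a, c)  # side opposite vertex b
    lc = math.dist(a, b)  # side opposite vertex c
    perimeter = la + lb + lc
    x = (la * a[0] + lb * b[0] + lc * c[0]) / perimeter
    y = (la * a[1] + lb * b[1] + lc * c[1]) / perimeter
    return x, y


def candidate_planting_point(triangles):
    """Incenter of the largest triangle in a given triangulation."""
    largest = max(triangles, key=lambda tri: triangle_area(*tri))
    return incenter(*largest)


# Two hypothetical triangles; the 3-4-5 right triangle is the larger one.
point = candidate_planting_point([
    ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0)),
    ((0.0, 0.0), (4.0, 0.0), (0.0, 3.0)),
])
```

For the 3-4-5 right triangle in the example the incenter is (1, 1), one inradius away from each side.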

The calculation formula for the RFI is as follows (Equation 16):

\begin{array}{l} R F I_{i} = \frac{\frac{1 + D A A_{i}}{δ_{D A A}} \cdot \frac{1 + M c_{i}}{δ_{M c}} \cdot \frac{1 + U_{i}}{δ_{U}}}{\frac{1 + C I_{i}}{δ_{C I}}} & (16) \end{array}

RFI_i represents the replanting foreground index of the replanted tree i, DAA_i refers to the area of the Delaunay triangulation element in which tree i is located. Mc_i corresponds to the complete mingling of tree i, while U_i denotes its neighborhood comparison, and CI_i indicates its crown competition index. Moreover, δ_DAA, δ_Mc, δ_U and δ_CI represent the standard deviations of their respective structural parameters.
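Equation 16 can be evaluated for a set of candidate planting locations as in the sketch below. The dictionary keys, the function name, and the use of the population standard deviation (`statistics.pstdev`) computed over the candidate set are assumptions; the source does not state which standard deviation estimator or reference population is used.

```python
from statistics import pstdev


def replanting_foreground_index(candidates):
    """RFI for each candidate location, following Equation 16.

    candidates: list of dicts with keys 'daa' (Delaunay triangle area),
    'mc' (complete mingling), 'u' (neighborhood comparison) and
    'ci' (crown competition index).
    """
    keys = ("daa", "mc", "u", "ci")
    # Standard deviation of each structural parameter across candidates.
    sd = {key: pstdev([cand[key] for cand in candidates]) for key in keys}
    if any(value == 0 for value in sd.values()):
        raise ValueError("each parameter must vary across candidates")
    scores = []
    for cand in candidates:
        numerator = (
            (1 + cand["daa"]) / sd["daa"]
            * (1 + cand["mc"]) / sd["mc"]
            * (1 + cand["u"]) / sd["u"]
        )
        denominator = (1 + cand["ci"]) / sd["ci"]
        scores.append(numerator / denominator)
    return scores


# Two hypothetical candidate locations.
rfi = replanting_foreground_index([
    {"daa": 12.0, "mc": 0.8, "u": 0.75, "ci": 0.5},
    {"daa": 6.0, "mc": 0.4, "u": 0.25, "ci": 2.0},
])
best_index = max(range(len(rfi)), key=rfi.__getitem__)
```

Because the standard deviations are shared by all candidates, the ranking depends only on (1 + DAA)(1 + Mc)(1 + U) / (1 + CI), so the first candidate (larger gap, weaker crown competition) scores higher.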

2.6.2 Planting number

Relevant studies have shown that the optimal planting density range for Pinus yunnanensis is 1667–3333 trees per hectare. In this study, 3333 trees per hectare was selected as the upper limit for stand density after replanting. Based on the plot area, the maximum number of trees was determined for each plot: P1, P2, P3, P4, and P5 had upper limits of 1284, 1076, 418, 377, and 941 trees, respectively.
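The conversion from a density limit to a per-plot tree count is simple arithmetic, sketched below. The sketch assumes the horizontal area of a circular plot and rounds down to a whole tree; the function name and the 20 m radius are hypothetical inputs chosen for illustration, and the plot-specific limits reported above rest on the plot areas in Table 1.

```python
import math


def max_trees(plot_radius_m, max_density_per_ha=3333):
    """Largest whole number of trees allowed in a circular plot."""
    area_ha = math.pi * plot_radius_m ** 2 / 10_000  # m^2 -> ha
    return math.floor(area_ha * max_density_per_ha)


# A 20 m radius plot covers about 0.1257 ha, giving a limit of 418 trees.
limit_20m = max_trees(20.0)
```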

2.6.3 Spatial configuration of replanting

In mixed forests, the proportion and spatial arrangement of replanted tree species directly influence the degree of species mingling, while variations in DBH, H, and CW among different species substantially affect interspecific competition. In this study, replanting optimization simultaneously considered the spatial distribution and size structure of trees to achieve a more balanced species composition and stand structure. To promote the positive succession of secondary forests and enhance the mingling index, the main tree species widely distributed in the Cangshan region were selected. The seven chosen species—Pinus yunnanensis, Pinus armandii, Quercus acutissima, Vaccinium bracteatum, Camellia sinensis, Betula alnoides, and Ternstroemia gymnanthera—are either dominant or native tree species in the study area and represent the typical species composition and ecological characteristics of local secondary forests. Equal proportions were applied in the simulation to simplify the modeling framework and highlight species diversity and ecological complementarity. The replanted trees were initialized with an average DBH of 5 cm and an age of 5 years. Due to the limited availability of growth models for some species, models of ecologically similar species (Xi et al., 2015; Luo, 2021; Jiazheng et al., 2021) were adopted to estimate tree height, crown width, and crown length, and the results were subsequently validated using existing datasets. The configuration of replanted trees is shown in Table 2.

Table 2. Configuration of replanted trees.

2.7 Stand structure prediction

2.7.1 Models

Based on the data characteristics of different tree factors and the requirements of the prediction tasks, this study selected four different machine learning models to ensure predictive accuracy and reliability. The CNN-PSO (Ma et al., 2022) model is suitable for handling continuous variables with complex nonlinear relationships; its convolutional layers can effectively capture latent spatial patterns among features, and thus it was used to predict age and crown length. The Random Forest (RF) model (Xiaonan et al., 2024), with its strong anti-overfitting capability and ability to handle high-dimensional data, can robustly manage complex interactions among tree growth factors, making it suitable for predicting tree height and crown width. The Multilayer Perceptron (MLP) (Kim et al., 2025), due to its powerful function approximation capability, was employed to model the growth of diameter at breast height (DBH), a complex nonlinear regression problem. The Gradient Boosting Decision Tree (GBDT) (Nhat-Duc and Van-Duc, 2023), which excels in classification tasks—particularly with imbalanced data and combining multiple weak classifiers—was used to predict the probability of tree mortality, a binary classification problem. This targeted model selection strategy aims to maximize the accuracy and robustness of each prediction task.

2.7.1.1 CNN-PSO

CNN consists of an input layer, output layer, convolutional layers, pooling layers, and fully connected layers. Its core concept is to automatically learn local features through convolutional layers, eliminating the need for manual feature extraction required by traditional methods. Additionally, pooling layers perform downsampling on the features, reducing computational complexity while enhancing the model’s robustness.

PSO is a global optimization algorithm that simulates swarm behavior, where each particle represents a potential solution. The particles continuously move through the solution space, searching for the optimal solution. PSO has strong global search capabilities and can efficiently identify high-quality solutions.

The CNN-PSO was adopted in this study, where the global search ability of PSO was utilized to optimize critical hyperparameters, such as the number of output channels in the convolutional layers, learning rate, and batch size. Traditional manual tuning is often constrained by experience and prone to getting stuck in local optima. By applying PSO to CNN hyperparameter optimization, the optimal parameter combination can be automatically identified, enhancing model performance while significantly reducing computational resources and time.
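The particle update described above can be illustrated with a minimal global-best PSO. This is a sketch under stated assumptions, not the configuration used in the study: the inertia and acceleration coefficients, the swarm size, and `toy_loss` (a hypothetical stand-in for a CNN validation loss) are all illustrative choices.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=200,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` inside the box `bounds` with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Each particle is one candidate solution (e.g. one hyperparameter combination).
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Velocity blends momentum, the particle's own best, and the swarm's best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Keep the particle inside the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_loss(params):
    """Hypothetical smooth loss with its minimum at (0.3, -0.2)."""
    return (params[0] - 0.3) ** 2 + (params[1] + 0.2) ** 2


best, best_val = pso_minimize(toy_loss, [(-1.0, 1.0), (-1.0, 1.0)])
```

In an actual hyperparameter search, each call to `objective` would train and validate a CNN, so the number of particles and iterations directly sets the computational cost.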

Figure 1

Map showing the Dali Bai Autonomous Prefecture alongside a topographic map with elevation ranging from 1,338 to 4,071 meters. Sample plots P1 to P11 are marked with different colored shapes. The inset map highlights Dali City and the Prefecture.

Figure 1. Location of the study area.

In the model design, two convolutional layers were used to extract features from the input data, each followed by a batch normalization layer to accelerate training and improve stability. Pooling layers were then applied to reduce data dimensionality, alleviating the computational burden and enhancing the model’s adaptability to local features. After convolution and pooling, the data were flattened and processed through fully connected layers before producing the final prediction output. To prevent overfitting, a Dropout layer was incorporated to randomly drop some neurons, encouraging the model to learn diverse features and thereby improving its predictive capability on new data. The resulting architecture is shown in Figure 2.

Figure 2

Diagram of a neural network architecture using Particle Swarm Optimization parameters. It shows an input layer connected to two convolutional blocks, each with a convolutional layer, batch normalization layer, and ReLU activation. The first block feeds into a pooling layer, then into the second block. The second block leads to a flatten layer, followed by a fully connected layer with ReLU, and finally a dropout layer, ending at the output layer.

Figure 2. The architecture of the CNN-PSO model.

2.7.1.2 Random Forest

RF is an ensemble learning algorithm based on the Bagging method and decision trees, using decision trees as its fundamental units. It constructs multiple decision trees and combines their predictions to enhance overall performance. During training, each tree is built using a randomly sampled subset of the original data, and feature selection at each node split is performed randomly. This approach reduces overfitting and improves the model’s generalization ability. By integrating the outputs of multiple decision trees, random forest effectively captures complex feature interactions, exhibits strong noise resistance, and achieves high prediction accuracy.

To further enhance the predictive performance of the model, this study employed ten-fold cross-validation and grid search for hyperparameter optimization. Ten-fold cross-validation partitions the dataset into ten subsets, using nine for training and the remaining one for validation in each iteration. This method mitigates biases caused by data partitioning and improves the model’s stability and generalization ability. Grid search systematically explores different hyperparameter combinations to optimize the performance of the random forest model. The tested hyperparameters include the number of trees (50, 100, 200), tree depth (10, 20, 30), the minimum number of samples required for node splitting (2, 5, 10), and the minimum number of samples required for leaf nodes (1, 2, 4). The overall framework is shown in Figure 3.

Figure 3

Flowchart illustrating a machine learning process: Input data undergoes preprocessing, followed by hyperparameter optimization using grid search and tenfold cross-validation. Multiple decision trees are created and averaged for aggregation, producing the final result.

Figure 3. Framework of the Random Forest Model.
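The combination of grid search and ten-fold cross-validation described above can be sketched without any machine learning library. The candidate values are those listed in the text; the dictionary keys, the interleaved (unshuffled) fold assignment, and the `score_fn` callback are hypothetical choices made for illustration only.

```python
from itertools import product

# Candidate values as listed in the text; the key names are hypothetical labels.
PARAM_GRID = {
    "n_trees": [50, 100, 200],
    "max_depth": [10, 20, 30],
    "min_samples_split": [2, 5, 10],
    "min_samples_leaf": [1, 2, 4],
}


def iter_grid(grid):
    """Yield every hyperparameter combination as a dict."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))


def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, val_idx) pairs; each sample is used for validation once."""
    # Assumption: samples are assigned to folds in an interleaved, unshuffled way.
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i, val_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, val_idx


def grid_search(grid, n_samples, score_fn, k=10):
    """Return the combination with the highest mean validation score.

    `score_fn(params, train_idx, val_idx)` is a caller-supplied callback that
    would fit a model on the training fold and score it on the validation fold.
    """
    best_params, best_score = None, float("-inf")
    for params in iter_grid(grid):
        scores = [score_fn(params, tr, va) for tr, va in k_fold_indices(n_samples, k)]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score
```

With four hyperparameters of three values each, the grid contains 81 combinations, so ten-fold cross-validation implies 810 model fits per search.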

2.7.1.3 Multilayer perceptron

MLP is a deep feedforward neural network composed of an input layer, multiple hidden layers, and an output layer. Each neuron in a layer is fully connected to the neurons in the previous layer. Input data undergoes weighted summation and activation function processing at each layer, ultimately producing the final prediction output. MLP possesses strong function approximation capabilities, enabling it to capture complex patterns and features in data through multiple hidden layers, thereby achieving high predictive accuracy.

In the model design, this study used MLP as the base model and constructed a Stacked-MLP model by stacking three MLPs. The prediction outputs of each MLP were used as inputs for a regressor, which generated the final prediction. To balance computational cost and model performance, the hyperparameters of the regressor were set within the following ranges: learning rate (0.0001–0.1), number of iterations (100–500), and batch size (16, 32, 64, 128). The architecture is shown in Figure 4.

Figure 4

Diagram showing a neural network and stacking ensemble. On the left, a multilayer perceptron (MLP) consists of an input layer, two hidden layers, and an output layer. On the right, the stacking ensemble includes three MLPs, which make initial predictions. These are combined and fed into a regressor, resulting in a final prediction.

Figure 4. The architecture of the Stacked-MLP model.

2.7.1.4 Gradient boosting decision tree

GBDT is an ensemble learning-based classification algorithm that incrementally trains multiple weak classifiers, combining their weighted predictions to obtain the final classification result. Each new model is optimized to correct the errors made by the previous model, focusing on misclassified samples and gradually adjusting the model parameters to improve classification performance. GBDT efficiently handles complex nonlinear relationships and can automatically identify interactions between features, reducing the need for extensive manual feature engineering. It offers high predictive accuracy and strong flexibility, making it a powerful tool for classification tasks.

To enhance model performance, improve stability, and strengthen generalization ability, this study employed grid search to explore the hyperparameter space and identify the optimal combination of parameters. The selected hyperparameters include learning rate (0.01, 0.1, 0.2), number of trees (50, 100, 200), and tree depth (10, 20, 30). The framework is shown in Figure 5.

Figure 5

Flowchart depicting a machine learning process. It starts with “Input Data,” moves through “Preprocessing” and “Grid Search.” Two decision trees labeled “Fitting Residuals” are shown, leading to a “Weighted Sum,” and ending with the “Result.”

Figure 5. Framework of the Gradient Boosting Decision Tree Model.

2.7.2 Prediction

Stand structure prediction involves forecasting parameters such as diameter at breast height, tree height, age, crown width, crown length, and mortality rate to understand the dynamic changes in stand structure. This enables dynamic optimization of stand structure and provides decision-making support for forest management, resource conservation, and ecological restoration.

We selected various indicators to construct the model, including age (AGE), diameter at breast height (DBH, 1/DBH, DBH²), tree height (H), crown width (CW), crown length (CL), number of trees per hectare (NT), stand density index (SDI), slope (SLO), aspect (ASP), height-to-diameter ratio (HDR), the sum of the basal area of trees larger than the target tree (BAL), and the Hegyi competition index (HCI). To limit multicollinearity when building the different predictive models, independent variables with a variance inflation factor (VIF) greater than 10 were excluded.
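For reference, the standard definition of the variance inflation factor for predictor $j$ is given below, where $R_j^{2}$ is the coefficient of determination obtained by regressing predictor $j$ on all the other predictors. Under this definition, the exclusion threshold of 10 corresponds to a predictor for which more than 90% of the variance is explained by the remaining predictors.

```latex
\mathrm{VIF}_j = \frac{1}{1 - R_j^{2}}, \qquad \mathrm{VIF}_j > 10 \iff R_j^{2} > 0.9
```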

The diameter growth model was developed using the Stacked-MLP algorithm, with AGE, ASP, NT, SLO, SDI, BAL, and DBH² selected as feature variables. The model’s target variable was set as the natural logarithm of diameter growth squared plus one, ln(DGI + 1). Since tree diameter growth occurs over long periods, shorter prediction intervals may fail to capture significant growth changes. Therefore, a five-year prediction cycle was used in the model.
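The log-transformed target can be mapped back to the original growth scale after prediction. The sketch below shows only the transform and its inverse; the function names are hypothetical, and `dgi` stands for whatever non-negative growth quantity DGI denotes in the model.

```python
import math


def to_target(dgi):
    """Transform a growth value to the model target ln(DGI + 1)."""
    return math.log1p(dgi)


def from_target(y):
    """Invert the transform: recover DGI from a predicted ln(DGI + 1)."""
    return math.expm1(y)
```

Adding one before taking the logarithm keeps the target defined when the growth value is zero.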

The RF algorithm was used for the tree height and crown width prediction models. The selected feature variables for tree height prediction were AGE, ASP, DBH, and SLO, with H as the target variable. For crown width prediction, the feature variables included AGE, 1/DBH, CL, ASP, SDI, HDR, and NT, with CW as the target variable.

The CNN-PSO was used for age and crown length prediction models. The feature variables for age prediction were 1/DBH, H, ASP, SLO, NT, and SDI, with AGE as the target variable. For crown length prediction, the selected feature variables were AGE, 1/DBH, ASP, SDI, HDR, and NT, with CL as the target variable.

Mortality prediction is a binary classification problem, and the GBDT model was chosen for this task. To improve prediction accuracy, the classification threshold was initially set at 0.5; however, this value is only applicable when the number of surviving and dead trees in the stand is approximately equal. In reality, mortality is a low-probability event, and cases where mortality and survival are equal are rare. Therefore, this study incorporated the HCI as an additional criterion. Trees with a mortality probability greater than 0.5 and experiencing high competition pressure (HCI > 0.75) were classified as dead. The selected feature variables included 1/DBH, H, BAL, and HCI, with tree mortality status (STATE) used as the target label.
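The combined decision rule described above can be written as a short predicate. This is a minimal sketch: the function and parameter names are hypothetical, and the mortality probability is assumed to come from the fitted GBDT model.

```python
def is_dead(mortality_prob, hci, prob_threshold=0.5, hci_threshold=0.75):
    """Classify a tree as dead only if both criteria from the text are met:
    predicted mortality probability > 0.5 and Hegyi competition index > 0.75."""
    return mortality_prob > prob_threshold and hci > hci_threshold
```

Requiring both conditions makes the rule stricter than the probability threshold alone, which fits the observation that mortality is a low-probability event.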

Since the stand structure prediction model simultaneously predicts multiple stand factors, redundancy in feature variable selection is inevitable, which may lead to internal inconsistencies within the model. To address this, predictions were conducted sequentially in the following order: age, diameter at breast height, tree height, mortality, crown width, and crown length. This ensures that the feature variables used in all prediction models are updated accordingly during the prediction process.
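The sequential update can be sketched as a loop in which each model sees the values already refreshed by the models before it. The `predictors` mapping and its callables are hypothetical placeholders for the fitted models; in practice, derived variables such as HDR or BAL would also need to be recomputed between steps.

```python
# Prediction order stated in the text: age, DBH, height, mortality, crown width, crown length.
PREDICTION_ORDER = ["AGE", "DBH", "H", "STATE", "CW", "CL"]


def predict_tree(tree, predictors, order=PREDICTION_ORDER):
    """Return a copy of `tree` with each factor updated in sequence.

    `predictors[name]` is a hypothetical callable that takes the current
    (partly updated) tree record and returns the new value of `name`.
    """
    updated = dict(tree)
    for name in order:
        # Later models read the values written by earlier ones.
        updated[name] = predictors[name](updated)
    return updated
```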

The VIF values of the feature variables and the prediction accuracy of each model are presented in Tables 3–5.


Table 3. VIF values of feature variables in different prediction models.


Table 4. Evaluation metrics of selected prediction models.


Table 5. Evaluation metrics for the mortality prediction model.

2.8 Optimization model of stand structure

2.8.1 Constraints

After optimization adjustments, the value of each sub-objective must be no worse than its pre-optimization level, so that the diversity of stand spatial structure does not decline. This means that the horizontal distribution pattern of the stand should become closer to a random distribution, the degree of species mingling should increase, the richness of vertical structure should be enhanced, and both size differentiation and competition pressure should be correspondingly reduced. During the optimization process, the number of tree diameter classes and species should not decrease, harvesting intensity should not exceed 35%, and canopy density should not fall below 0.7. The number of replanted trees should remain within the stand density range, and after replanting, both the degree of mingling and the horizontal distribution pattern should be improved compared to pre-replanting conditions (Equation 17).

\begin{array}{ll} \text{s.t.} \begin{cases} \bar{Mc_{1}} \geq \bar{Mc_{0}} \\ \bar{S_{1}} \geq \bar{S_{0}} \\ \bar{U_{1}} \leq \bar{U_{0}} \\ \bar{CI_{1}} \leq \bar{CI_{0}} \\ |\bar{W_{1}} - 0.496| \leq |\bar{W_{0}} - 0.496| \\ D_{1} = D_{0} \\ T_{1} = T_{0} \\ Cd \geq 0.7 \\ N_{1} \geq N_{0}(1 - 35\%) \\ |\bar{W_{2}} - 0.496| \leq |\bar{W_{1}} - 0.496| \\ \bar{Mc_{2}} \geq \bar{Mc_{1}} \\ 1667 \leq PD \leq 3333 \end{cases} & (17) \end{array}

$\bar{Mc_{1}}$, $\bar{S_{1}}$, $\bar{U_{1}}$, $\bar{CI_{1}}$, $\bar{W_{1}}$, $D_{1}$, $T_{1}$, $Cd$, and $N_{1}$ represent, respectively, the complete mingling, stratification index, neighborhood comparison, crown competition index, uniform angle index, number of tree diameter classes, number of species, canopy density, and harvesting intensity after selective harvesting. $\bar{Mc_{0}}$, $\bar{S_{0}}$, $\bar{U_{0}}$, $\bar{CI_{0}}$, $\bar{W_{0}}$, $D_{0}$, $T_{0}$, $Cd$, and $N_{0}$ represent the corresponding values for the initial stand. $\bar{Mc_{2}}$ and $\bar{W_{2}}$ represent the complete mingling and uniform angle index after replanting. $PD$ represents the stand density after selective harvesting and replanting optimization.
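The constraints in Equation 17 can be checked programmatically once the stand-level means are available. The sketch below is illustrative: the `StandSummary` container and both function names are hypothetical, and `n` is assumed here to be the tree count entering the harvesting-intensity constraint.

```python
from dataclasses import dataclass

W_RANDOM = 0.496  # reference value of the uniform angle index used in Equation 17


@dataclass
class StandSummary:
    mc: float               # mean complete mingling
    s: float                # mean stratification index
    u: float                # mean neighborhood comparison
    ci: float               # mean crown competition index
    w: float                # mean uniform angle index
    diameter_classes: int   # D in Equation 17
    species: int            # T in Equation 17
    canopy_density: float   # Cd in Equation 17
    n: int                  # N in Equation 17 (assumed: number of trees)


def harvest_ok(before, after, max_intensity=0.35, min_canopy=0.7):
    """Check the selective-harvesting constraints (subscripts 0 and 1)."""
    return (after.mc >= before.mc
            and after.s >= before.s
            and after.u <= before.u
            and after.ci <= before.ci
            and abs(after.w - W_RANDOM) <= abs(before.w - W_RANDOM)
            and after.diameter_classes == before.diameter_classes
            and after.species == before.species
            and after.canopy_density >= min_canopy
            and after.n >= before.n * (1 - max_intensity))


def replant_ok(after_harvest, after_replant, stand_density,
               min_density=1667, max_density=3333):
    """Check the replanting constraints (subscripts 1 and 2) and the density range."""
    return (abs(after_replant.w - W_RANDOM) <= abs(after_harvest.w - W_RANDOM)
            and after_replant.mc >= after_harvest.mc
            and min_density <= stand_density <= max_density)
```

A candidate harvesting or replanting action would be rejected or penalized whenever the corresponding check returns `False`.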

2.8.2 Model construction

Stand structure optimization is a multi-objective problem that typically involves multiple interrelated goals. Since these objectives are often mutually constraining, it is difficult to achieve their individual optima simultaneously. Therefore, it is necessary to adopt an integrated approach and balance multiple objectives from the perspective of the overall stand. In the optimization process, the complete mingling and stratification index should be increased to enhance stand diversity and stability; the crown competition index and neighborhood comparison should be reduced to alleviate competition among individuals and prevent excessive dominance by a few large trees; and the uniform angle index should be maintained within the range close to random distribution to ensure a reasonable horizontal pattern. Accordingly, in this study, five spatial structure parameters—uniform angle index, neighborhood comparison, complete mingling, stratification index, and crown competition index—were incorporated into a multi-objective optimization model using a multiplicative–divisive approach to construct the objective function (Equation 18):

\begin{array}{ll} \max L = \frac{1}{N} \sum_{i = 1}^{N} \frac{\frac{1 + Mc_{i}}{\delta_{Mc}} \cdot \frac{1 + S_{i}}{\delta_{S}}}{\frac{1 + U_{i}}{\delta_{U}} \cdot \frac{1 + CI_{i}}{\delta_{CI}} \cdot \frac{1 + |W_{i} - 0.496|}{\delta_{|W_{i} - 0.496|}}} & (18) \end{array}

W_i, Mc_i, S_i, U_i, and CI_i represent the uniform angle index, complete mingling, stratification index, neighborhood comparison, and crown competition index of the reference tree, respectively. δ_W (written δ_{|W_i − 0.496|} in Equation 18), δ_Mc, δ_S, δ_U, and δ_CI are the standard deviations of these structural indexes. N represents the total number of trees in the forest stand.
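Equation 18 can be evaluated directly from per-tree index values. The following is a minimal sketch: the dictionary keys are hypothetical, the standard deviations are passed in rather than computed (the text does not say how they are estimated), and the result is averaged over the trees in the stand.

```python
def objective(trees, sd, w_random=0.496):
    """Evaluate the multiplicative-divisive objective L of Equation 18.

    trees: list of dicts with keys "Mc", "S", "U", "CI", "W" (per-tree indexes).
    sd:    dict with the same keys holding the standard deviation of each term.
    """
    total = 0.0
    for t in trees:
        # Indexes to be increased go in the numerator.
        numerator = ((1 + t["Mc"]) / sd["Mc"]) * ((1 + t["S"]) / sd["S"])
        # Indexes to be reduced, and the deviation of W from 0.496, go in the denominator.
        denominator = (((1 + t["U"]) / sd["U"])
                       * ((1 + t["CI"]) / sd["CI"])
                       * ((1 + abs(t["W"] - w_random)) / sd["W"]))
        total += numerator / denominator
    return total / len(trees)
```

Because every term has the form one plus an index, a tree with an index value of zero still contributes a finite, nonzero factor.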

2.9 Solution algorithm

2.9.1 Multi-agent deep reinforcement learning

This study selected multi-agent deep Q-network (MADQN) as the solution algorithm. The MADQN model incorporates deep Q-network (DQN) and multi-agent Q-learning (MAQL). Each agent utilizes a deep neural network to approximate the Q-value function, enabling decision-making in large-scale and complex state spaces. By leveraging experience replay and target networks, the model achieves faster convergence, avoids overfitting, and ensures policy stability and generalization capability.

In the application of MADQN for stand structure optimization, selective harvesting and replanting serve as two key regulatory measures. Two agents, Agent1 and Agent2, were designed to achieve optimization. Each agent has its own tasks and objectives, interacting with the environment while also collaborating with each other. Through continuous learning, they work together to optimize the objective function of stand structure.

Agent1 selects trees for selective harvesting based on the random selection method and adjusts its strategy according to the impact of tree removal on the objective function value. After harvesting, if the selected trees result in an increase in the objective function value, Agent1 receives a reward; otherwise, it is penalized. Through this process, Agent1 gradually learns how to select trees for harvesting, aiming to minimize unnecessary losses while maximizing the improvement of the stand’s objective function during the harvesting process.

Agent2 determines the number of replanted trees and their distribution based on the stand condition after Agent1 has completed selective harvesting and received its reward. Its goal is to compensate for gaps created by harvesting and introduce appropriate tree species to optimize stand structure, enhancing growth potential and ecological benefits. After harvesting, Agent2 utilizes the RFI to identify suitable planting locations. It then applies a curve trend-based approach (Xuan et al., 2024) to optimize the number of replanted trees. Specifically, it selects three evenly spaced replanting densities and evaluates their corresponding objective function values to establish a trend curve. Based on this trend, Agent2 adjusts the number of replanted trees by selecting new values with the same spacing before and after the current replanting density. This approach leverages curve monotonicity and extremum detection to assign rewards or penalties, refining the sampling strategy to gradually determine the optimal number of replanted trees. This ensures that under the given harvesting conditions, the replanting effect is maximized for optimal stand structure recovery.
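One way to read the three-point trend procedure is as a bounded local search over the number of replanted trees. The sketch below is an interpretation rather than the authors' algorithm: moving toward the better neighbor stands in for the reward signal, and halving the spacing when the middle point is a local extremum is an assumption introduced here so that the search terminates.

```python
def trend_search(objective, start, step, lo, hi):
    """Search for the replanting count that maximizes `objective`.

    Three evenly spaced counts (x - step, x, x + step) are evaluated per round.
    A monotone trend moves x toward the better side; if the middle point is the
    best of the three, the spacing is halved (assumption) to refine the estimate.
    """
    x = max(lo, min(hi, start))
    while step >= 1:
        left, right = max(lo, x - step), min(hi, x + step)
        f_left, f_mid, f_right = objective(left), objective(x), objective(right)
        if f_left > f_mid and f_left >= f_right:
            x = left            # objective rises toward fewer replanted trees
        elif f_right > f_mid:
            x = right           # objective rises toward more replanted trees
        else:
            step //= 2          # middle point is a local extremum: narrow the spacing
    return x


def toy_objective(n_replanted):
    """Hypothetical single-peaked objective with its maximum at 37 trees."""
    return -(n_replanted - 37) ** 2


best_count = trend_search(toy_objective, start=50, step=16, lo=0, hi=100)
```

Each move strictly increases the objective and the candidate counts are bounded integers, so the loop always terminates.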

Agent1 and Agent2 are interdependent and interact throughout the optimization process. Agent1’s selective harvesting decisions directly influence Agent2’s replanting strategy, while Agent2’s replanting results, in turn, affect the post-harvesting stand condition and consequently impact the objective function value. Through a reward and penalty mechanism, both agents continuously adjust their strategies, enabling a coordinated optimization process that dynamically balances selective harvesting and replanting to achieve the best possible stand structure.

The process of solving stand structure optimization using multi-agent deep reinforcement learning is illustrated in Figure 6 (Algorithms 1–3).

Figure 6

Flowchart illustrating decision-making for cutting and replanting agents. It includes frames for agent actions, state changes, environment constraints, and rewards or punishments. Arrows indicate transitions between frames, highlighting optimizing directions for action. The cutting agent receives different levels of rewards or punishments based on satisfying constraints, while the replanting agent selects replanting quantities, calculating outcomes with rewards and punishments. Both processes feed into a replay buffer connected to neural networks, emphasizing learning from actions to minimize loss.

Figure 6. MADQN for stand structure optimization.

2.9.2 Solution for dynamic stand structure optimization

During the subsequent optimization process, the selective harvesting and replanting agents continuously collaborate and interact, iteratively optimizing the current stand condition to approach the optimal stand structure. In each optimization cycle, the agents learn from environmental feedback, gradually approaching the structural characteristics of an ideal stand. However, real-world stand structures are influenced by multiple factors, and relying solely on selective harvesting and replanting may not directly achieve the desired stand structure. Therefore, a stand structure prediction model is incorporated as a crucial step to further enhance the accuracy and effectiveness of the optimization process.

The stand structure prediction model is used to forecast tree growth trends over a future period based on the current stand condition, as well as the potential changes after selective harvesting and replanting. After providing predictions on the current stand status, the agents further optimize the coordination between harvesting and replanting. At this stage, the selective harvesting and replanting agents not only rely on their original decision framework but also adjust the stand structure based on the prediction results. This process operates as a cyclic feedback mechanism, where each optimization generates a new stand condition, which in turn serves as the starting point for the next round of optimization. After each selective harvesting and replanting step, the updated stand condition is fed into the stand structure prediction model, and the model’s prediction results influence the next round of harvesting and replanting decisions. Through this approach, harvesting and replanting decisions are dynamically adjusted, ensuring that each step moves closer to an ideal stand structure. This iterative optimization process allows the model to identify the best regulatory strategies in a complex and dynamically changing stand environment. By avoiding a fixed strategy from the outset, the optimization process becomes more flexible and adaptive to stand dynamics.

The solution process for dynamic stand structure optimization is illustrated in Figure 7 (Algorithm 1).

Figure 7

Flowchart illustrating a process starting with “Begin” and “Initial Stand”. It proceeds to “Optimization using MADQN” with options for “Cutting” and “Replanting”. The decision point “Ideal Stand” leads to either “End” if yes, or calculates AGE, DBH, H, CW, CL, Mort if no. This loops back for prediction and optimization until achieving the ideal stand.

Figure 7. Dynamic optimization process flowchart.
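The cycle in Figure 7 can be summarized as a loop that alternates optimization and prediction until the ideal stand is reached. This is a schematic sketch: `optimize`, `predict`, and `is_ideal` are hypothetical callables standing for the MADQN step, the structure prediction models, and the ideal-stand test, and the `max_cycles` cap is an assumption added to guarantee termination.

```python
def dynamic_optimize(stand, optimize, predict, is_ideal, max_cycles=10):
    """Alternate optimization and growth prediction until the stand is ideal.

    Returns the final stand and the list of stands obtained after each
    optimization step.
    """
    history = []
    for _ in range(max_cycles):
        stand = optimize(stand)      # selective harvesting and replanting
        history.append(stand)
        if is_ideal(stand):
            break
        stand = predict(stand)       # project AGE, DBH, H, CW, CL, mortality forward
    return stand, history
```

Each optimized stand becomes the input to the prediction step, and each predicted stand becomes the starting point of the next optimization round.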

2.9.3 Parameter settings

The parameter settings for the solution algorithm used in the experiment are shown in Table 6. The initial iteration count and maximum iteration count for the solution algorithm were set to 0 and 1000, respectively. During the optimization process, the stand structure at different time periods is abstracted into a sequence. At the beginning of each iteration, the selective harvesting agent starts at the initial state ($state1 = 0$), while the replanting agent starts at the final state ($state2 = 100$). Both agents interact with the environment and collaborate with each other to decide whether to move forward ($state1 = state1 + 1$, $state2 = state2 - 1$) or move backward ($state1 = state1 - 1$, $state2 = state2 + 1$). The iteration ends when the selective harvesting agent and the replanting agent meet ($state1 = state2$). To compare the performance of MADQN and MAQL, both algorithms were set with the same hyperparameters ($\gamma = 0.9$, $lr = 0.01$, $\epsilon = 0.9$). Additionally, for MADQN, the experience replay buffer size was set to 10000, with a batch size of 32. A three-layer fully connected network was used, with each hidden layer containing 24 neurons. These parameter settings were obtained through multiple experiments and fine-tuning.


Table 6. Parameter settings of the solution algorithm.
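Two of the DQN components configured above, the experience replay buffer and the discounted learning target, can be sketched with the standard library alone. The buffer size, batch size, and discount factor follow the stated settings; the class and function names are hypothetical, and the next-state Q-values are passed in as a plain list, whereas in MADQN they would come from the target network.

```python
import random
from collections import deque

GAMMA = 0.9           # discount factor from the parameter settings
BUFFER_SIZE = 10_000  # experience replay buffer size
BATCH_SIZE = 32       # mini-batch size


class ReplayBuffer:
    """Fixed-capacity store of transitions; the oldest entries are dropped first."""

    def __init__(self, capacity=BUFFER_SIZE, seed=None):
        self._items = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def push(self, state, action, reward, next_state, done):
        self._items.append((state, action, reward, next_state, done))

    def sample(self, batch_size=BATCH_SIZE):
        """Draw a random mini-batch without replacement."""
        k = min(batch_size, len(self._items))
        return self._rng.sample(list(self._items), k)

    def __len__(self):
        return len(self._items)


def td_target(reward, next_q_values, done, gamma=GAMMA):
    """One-step Q-learning target: r if terminal, else r + gamma * max_a Q(s', a)."""
    return reward if done else reward + gamma * max(next_q_values)
```

Sampling transitions at random from the buffer breaks the correlation between consecutive steps, which is the usual motivation for experience replay.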

In structured forest management, the three most important spatial structure indexes are neighborhood comparison, which represents size differentiation and competition intensity; complete mingling, which indicates species segregation; and the uniform angle index, which describes horizontal distribution patterns. Considering the limitations of Pinus yunnanensis secondary forest plots, ($U \leq 0.5$, $Mc \geq 0.75$, $0.475 \leq W \leq 0.517$) was selected as the ideal stand structure characteristic for dynamic stand structure optimization (Gangying et al., 2005, 2018).
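The ideal-stand criterion can be expressed as a simple predicate on the stand-level means. The function and argument names below are hypothetical; the thresholds are those given in the text.

```python
def is_ideal_stand(u_mean, mc_mean, w_mean):
    """Return True when U <= 0.5, Mc >= 0.75, and 0.475 <= W <= 0.517."""
    return u_mean <= 0.5 and mc_mean >= 0.75 and 0.475 <= w_mean <= 0.517
```

A predicate of this form can serve as the stopping test of the dynamic optimization cycle.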

3 Results

To verify the effectiveness of the multi-agent deep reinforcement learning solution in stand structure optimization, this study selected five standard plots with different densities and site conditions for simulation experiments. For optimizing the current stand condition, a comparative experiment was conducted between the MADQN and MAQL under the same selective harvesting and replanting methods to evaluate the optimization advantages of MADQN. In the dynamic optimization process, MADQN was integrated with stand structure prediction to enable dynamic adjustments and optimization of the stand structure over time.

3.1 Results of simulated harvesting optimization

3.1.1 Current stand structure optimization

As shown in Figure 8, after implementing the two optimization schemes for coordinated selective harvesting and replanting, the stand structure indexes in each plot improved to varying degrees while meeting the constraint conditions. In each plot, the deviation of the average uniform angle index from 0.496 decreased slightly, indicating that the horizontal distribution pattern of the stands moved closer to a random distribution. The complete mingling index increased significantly across all plots, largely because the initial mingling degree in each plot was extremely low, leaving ample room for improvement. Notably, in Plot P4, the increase reached 16602.09%. Additionally, the crown competition index decreased substantially in all plots, indicating that tree competition pressure was alleviated after optimization. The stratification index showed a moderate increase, suggesting an improvement in vertical structural complexity and a more diverse vertical distribution pattern. However, the neighborhood comparison showed minimal changes across all plots. This is likely because the initial average neighborhood comparison values were already in a moderate growth state, limiting the potential for significant improvement.

Figure 8

Six bar charts compare parameter changes for five variables, labeled P1 to P5. Each chart contrasts changes between two methods, MAQL (orange) and MADQN (green). The top row charts show “Difference between W and 0.496” and “CI,” highlighting negative changes. The middle row illustrates “U” with changes slightly negative, and “P1-P3” over “P4-P5” depicting increases. The bottom row shows “Mc” with substantial increases, and “S” with moderate parameter changes. Each chart presents different scales of percent changes.

Figure 8. Changes in structure indexes after current stand optimization.

As shown in Table 7, in the current stand structure optimization, both MADQN and MAQL significantly improved the objective function values. However, in terms of overall improvement, MADQN consistently outperformed MAQL. The objective function values for plots P1 to P5 under MADQN optimization increased from 0.3501, 0.3799, 0.3982, 0.3344, and 0.4294 to 0.5378, 0.5861, 0.5860, 0.5130, and 0.6034, respectively—higher than the values achieved by MAQL (0.5302, 0.5369, 0.5766, 0.5014, and 0.5906). The improvement rate under MADQN reached 49.40%, exceeding the 44.58% achieved by MAQL. These results indicate that MADQN is more effective in optimizing stand structure, guiding it toward a more optimal target state.


Table 7. Current stand structure optimization under different optimization schemes across various plots.
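The relative gains implied by the reported objective function values can be recomputed as a quick check. The sketch below uses the values quoted in the text; how the overall improvement rate was aggregated is not stated, so the sum-based ratio used here is an assumption. Under that assumption, the aggregate gains land within about 0.1 percentage point of the reported 49.40% and 44.58%, with the small gap plausibly due to rounding of the tabulated values.

```python
# Objective function values for plots P1-P5 as quoted in the text.
BEFORE = [0.3501, 0.3799, 0.3982, 0.3344, 0.4294]
MADQN = [0.5378, 0.5861, 0.5860, 0.5130, 0.6034]
MAQL = [0.5302, 0.5369, 0.5766, 0.5014, 0.5906]


def pct_gain(before, after):
    """Relative improvement in percent."""
    return (after - before) / before * 100.0


def aggregate_gain(before, after):
    """Assumed aggregation: gain of the summed objective values over all plots."""
    return pct_gain(sum(before), sum(after))


for name, after in (("MADQN", MADQN), ("MAQL", MAQL)):
    per_plot = [round(pct_gain(b, a), 2) for b, a in zip(BEFORE, after)]
    print(name, per_plot, round(aggregate_gain(BEFORE, after), 2))
```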

As shown in Figure 9, MADQN outperformed MAQL in terms of the number of iterations required for optimization. For the same number of iterations, MADQN exhibited a faster increase in objective function values, especially in Plots P2 and P4. This indicates that MADQN, which uses deep neural networks to approximate the Q-function, can find optimal strategies more quickly during the optimization process than MAQL, which relies on table-based Q-learning. As a result, MADQN requires less extensive exploration and achieves higher learning efficiency. It is worth noting that in Plots P1 and P3, although MADQN achieved a higher objective function value, MAQL required slightly fewer iterations to converge. This suggests that stand density and site conditions influence convergence performance, and in certain cases, MAQL’s convergence efficiency is not necessarily inferior to that of MADQN. As shown in Figure 10, MADQN also demonstrated better overall performance in terms of computational efficiency. Across different plots, MADQN consistently maintained a lower or comparable runtime curve compared to MAQL, indicating superior time efficiency.

Figure 9

Five line graphs labeled P1 to P5 compare the performance of MAQL (black line) and MADQN (red line) over 400 iterations. Each graph shows function values increasing and stabilizing, with MADQN consistently reaching higher values than MAQL across all panels.

Figure 9. Convergence states of different optimised strategies in different plots.

Figure 10

Five line graphs showing the value of a function over running time for scenarios P1 to P5. Each graph compares two methods: MAQL (black line) and MADQN (red line). The x-axis represents running time in minutes, ranging from zero to twenty thousand, and the y-axis represents function values ranging from zero point three five to zero point six. In each graph, MADQN consistently achieves higher values than MAQL, particularly evident at early time stages.

Figure 10. Running time of different optimised strategies in different plots.

3.1.2 Dynamic stand structure optimization

In the dynamic stand structure optimization of five plots using multi-agent deep reinforcement learning combined with structure prediction, most stand structure indexes showed significant improvements after optimization. The uniform angle index for all plots fell within the ideal range of [0.475, 0.517], indicating that the horizontal distribution pattern had reached a random distribution state. The complete mingling index increased substantially across all plots, shifting from a very low mingling state to a high mingling state. This demonstrates that dynamic optimization not only adjusts the spatial relationships among trees but also enhances stand stability and biodiversity at the species level. Notably, in Plot P4, the mingling index increased from 0.0028 to 0.7505, highlighting the strong adaptability of multi-agent deep reinforcement learning in adjusting tree species composition. Additionally, the crown competition index significantly decreased across all plots, indicating a considerable reduction in competition pressure. The stratification index also improved effectively, enhancing the vertical distribution pattern. In contrast, the neighborhood comparison showed minimal decline across all plots, suggesting that the pre-optimization stand already exhibited a relatively stable size differentiation state. As a result, despite some adjustments during optimization, fluctuations in this index remained small.

As shown in Table 8, after incorporating structure prediction for dynamic optimization, the objective function values for all plots experienced significant improvements. Notably, in Plot P4, the objective function value increased from 0.3344 to 0.5863, achieving a remarkable 75.33% increase. Even in Plot P5, where the improvement was relatively smaller, the increase still reached 44.62%. These results indicate that dynamic optimization using multi-agent deep reinforcement learning combined with structure prediction effectively enhances stand structure stability and balance, making it more aligned with an ideal management state.


Table 8. Dynamic stand structure optimization under different optimization schemes across various plots.

4 Discussion

To avoid the limitations of relying solely on selective harvesting for optimization, this study proposed a multi-objective stand structure optimization scheme based on multi-agent deep reinforcement learning. A simulation experiment was conducted using sample plot data from Pinus yunnanensis secondary forests in Southwest China, where MADQN was applied for the simulation and compared with a multi-agent reinforcement learning optimization scheme. The results showed that under both optimization algorithms, stand structure indexes improved to varying degrees across all plots. However, compared to MAQL, MADQN consistently achieved higher optimization gains across different stand conditions, demonstrating greater adaptability and stability. These findings indicate that multi-agent deep reinforcement learning can learn more optimal strategies in complex environments and achieve more comprehensive optimization in a shorter time.

Traditional multi-agent reinforcement learning is limited by the dimensionality of the state-action space, especially in complex environments such as stand structure optimization. Stand structure features are nonlinear and continuous, which makes it difficult for tabular methods to store Q-values and leads to high storage overhead and poor generalization. In contrast, multi-agent deep reinforcement learning represents Q-values in parametric form, using neural networks to extract stand structure features. It also employs experience replay and target networks to improve training efficiency and stability (Gronauer and Diepold, 2022; Shen et al., 2022). This allows the same agents, under selective harvesting and replanting measures, to learn more smoothly and to approach a globally optimal strategy, with better convergence.
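The two stabilization mechanisms named above, experience replay and a target network, can be illustrated with a small generic sketch. It is not the MADQN implementation used in this study: a linear Q-function stands in for the neural network, and the five state features, two actions, rewards, and hyperparameters are hypothetical values chosen only to make the example run.

```python
import random
from collections import deque

import numpy as np


class ReplayBuffer:
    """Fixed-size store of past transitions, sampled uniformly at random."""

    def __init__(self, capacity: int):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size: int):
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


class LinearQ:
    """Parametric Q-function Q(s, a) = W[a] . s (a stand-in for a neural network)."""

    def __init__(self, state_dim: int, n_actions: int, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(scale=0.01, size=(n_actions, state_dim))

    def q_values(self, state: np.ndarray) -> np.ndarray:
        return self.W @ state

    def copy_from(self, other: "LinearQ") -> None:
        self.W = other.W.copy()


def dqn_update(online, target, batch, gamma=0.9, lr=0.01):
    """One semi-gradient step per sampled transition, bootstrapping from the target network."""
    for state, action, reward, next_state, done in batch:
        bootstrap = 0.0 if done else gamma * np.max(target.q_values(next_state))
        td_error = reward + bootstrap - online.q_values(state)[action]
        online.W[action] += lr * td_error * state


# Toy usage: 5 hypothetical state features and 2 hypothetical actions.
random.seed(0)
rng = np.random.default_rng(0)
online, target = LinearQ(5, 2), LinearQ(5, 2)
target.copy_from(online)
buffer = ReplayBuffer(capacity=1000)
for _ in range(200):
    s, s_next = rng.random(5), rng.random(5)
    buffer.push(s, int(rng.integers(2)), float(s_next.mean()), s_next, False)

for step in range(100):
    dqn_update(online, target, buffer.sample(32))
    if step % 20 == 0:  # periodic hard update of the target network
        target.copy_from(online)
```

Sampling past transitions at random breaks the correlation between consecutive updates, and bootstrapping from a periodically synchronized copy keeps the learning target from shifting at every step. The weight matrix has a fixed size regardless of how many distinct states are visited, which is the storage advantage over a Q-table.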

Although multi-agent deep reinforcement learning offers clear advantages in optimization efficiency and accuracy, the strategy it derives is based solely on the current stand structure. Stand structure, however, is a dynamic system that changes over time through tree growth, mortality, and human intervention, so optimizing only the current condition may not meet long-term management needs. Stand structure prediction models can use current tree factors, such as diameter at breast height, tree height, crown width, and crown length, to predict their future trends and thereby simulate the natural evolution of the stand (Ali, 2019). In this study, multi-agent deep reinforcement learning was combined with structure prediction, which supplies dynamic environmental information to the optimization process. The resulting strategy applies not only to the current stand state but is also updated according to the predicted evolution of the stand, enabling the agents to formulate more robust strategies that account for long-term change. Based on the predicted information, the agents can adjust the spatial configuration of trees in advance and allocate growth resources more effectively, which helps keep the optimization effect stable over the long term. Because stand structure changes gradually, the strategy can respond to potential future risks, such as intensified competition or mortality, so that the stand remains balanced during succession and the structural imbalance caused by short-term optimization is avoided. This approach is better suited to the developmental needs of the stand over different time scales.

Integrating stand structure prediction also affects optimization time and the way strategies are adjusted. Without prediction, the agents typically require more iterations to adapt to environmental changes. With the prediction model, the agents obtain information on potential structural changes earlier, which reduces unnecessary exploration and improves optimization efficiency. Moreover, the long-term trend information provided by the prediction model allows for more precise optimization strategies and prevents fluctuations in stand structure caused by short-term optimization.
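The coupling of optimization and prediction discussed here can be read as a rolling loop: optimize the current stand, project it forward with the prediction model, and optimize the projected stand again. The sketch below illustrates only this control flow. `predict_growth`, `optimize_stand`, the 5-year step, the fixed diameter increment, and the thinning rule are hypothetical placeholders, not the prediction models or the MADQN agents used in this study.

```python
from dataclasses import dataclass, replace
from typing import Callable, List


@dataclass(frozen=True)
class Tree:
    species: str
    dbh_cm: float  # diameter at breast height


Stand = List[Tree]


def predict_growth(stand: Stand, years: int) -> Stand:
    """Hypothetical growth model: a fixed DBH increment of 0.3 cm per year."""
    return [replace(t, dbh_cm=t.dbh_cm + 0.3 * years) for t in stand]


def optimize_stand(stand: Stand) -> Stand:
    """Hypothetical optimizer: remove the thinnest tree while more than three remain."""
    if len(stand) <= 3:
        return list(stand)
    thinnest = min(stand, key=lambda t: t.dbh_cm)
    return [t for t in stand if t is not thinnest]


def dynamic_optimization(
    stand: Stand,
    horizon_years: int,
    step_years: int,
    predict: Callable[[Stand, int], Stand],
    optimize: Callable[[Stand], Stand],
) -> List[Stand]:
    """Alternate optimization and prediction until the horizon is reached."""
    history = [list(stand)]
    current = list(stand)
    for _ in range(0, horizon_years, step_years):
        current = optimize(current)             # act on the present structure
        current = predict(current, step_years)  # project the stand forward
        history.append(current)
    return history


# Toy usage with five hypothetical trees and a 15-year horizon.
stand = [Tree("Pinus yunnanensis", d) for d in (8.0, 12.0, 15.0, 20.0, 25.0)]
history = dynamic_optimization(
    stand, horizon_years=15, step_years=5,
    predict=predict_growth, optimize=optimize_stand,
)
print([len(s) for s in history])
```

Passing the prediction and optimization steps as callables keeps the loop independent of any particular growth model or learning algorithm, so either component can be replaced without touching the control flow.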

The optimization results show that the key stand structure indicators of all plots improved significantly after optimization and that the stand structure approached the ideal condition. This supports the feasibility and effectiveness of the method for long-term forest management: when dynamic stand changes are considered, the stand can maintain a reasonable spatial configuration over extended time scales. Compared with the marked changes in other indicators, the change in neighborhood comparison was relatively small. On the one hand, the degree of size differentiation in the plots was already stable before optimization (Xuan et al., 2023). On the other hand, the method relies on selective harvesting and replanting as its main regulatory measures, so the optimization focused on mingling degree, competition pressure, and distribution pattern, and its direct effect on neighborhood comparison was limited. The optimization time also varied across plots: Plot P4 required a longer optimization time, while Plot P5 took relatively less. This may be related to the initial stand conditions and the difficulty of optimization. Plot P5 had a more balanced initial stand, with a horizontal distribution pattern already close to the ideal, and its competition pressure and mingling degree were also more favorable than those of Plot P4, which allowed quicker convergence to an optimal adjustment strategy. In contrast, Plot P4 had poor mingling, relatively high competition pressure, and a less favorable horizontal distribution pattern, so the optimization process required more rounds of exploration and adjustment to reach an optimal outcome.

Furthermore, optimization time may also be influenced by the multi-agent deep reinforcement learning algorithm itself, since different exploration strategies and parameter settings directly affect optimization efficiency. This conclusion is consistent with that obtained using multi-agent reinforcement learning for stand structure optimization (Xuan et al., 2024).

Nevertheless, this research has the following limitations and aspects that require further refinement: (1) This study used only the basic MADQN algorithm within multi-agent deep reinforcement learning. Further research is needed to explore whether more advanced algorithms and corresponding improvements could be more effective in solving multi-objective stand structure optimization problems. (2) Due to the limited data coverage of the research plots, some tree factor predictions remain inaccurate. In addition, the current prediction models have limitations in capturing the growth variability of individual trees and complex environmental factors. More suitable prediction methods should therefore be selected in the future to improve the accuracy of structure prediction. (3) The current dynamic optimization primarily adjusts mingling degree, spatial distribution pattern, and competition pressure, with relatively limited optimization of neighborhood comparison. Future optimization strategies will pay more attention to neighborhood comparison in order to optimize the diameter structure of the stand more precisely and to enhance overall balance and growth stability. (4) For the spatial configuration of replanted trees, the study currently uses a proportional method to select multiple native species. However, due to the limited number of certain species, a basic model is used to determine tree height, crown width, and other tree factors. Future research will further optimize the spatial configuration strategy for replanted trees, making the adjustment of species composition more scientific and rational. More accurate models will also be introduced to improve the prediction of fundamental tree factors, thus enhancing the adaptability and long-term stability of the optimization scheme.

5 Conclusion

This study applies multi-agent deep reinforcement learning to the field of stand structure optimization. The objective function is established using complete mingling, uniform angle index, neighborhood comparison, stratification index, and crown competition index, with selective harvesting and replanting measures for coordinated optimization. Comparative simulation experiments with multi-agent reinforcement learning across different plots showed that the objective function values of multi-agent deep reinforcement learning in each plot were 0.5378, 0.5861, 0.5860, 0.5130, and 0.6034, all higher than those of multi-agent reinforcement learning, which were 0.5302, 0.5369, 0.5766, 0.5014, and 0.5906. These results demonstrate the superiority of multi-agent deep reinforcement learning in stand structure optimization. Considering the dynamic nature of stand structure, combining structural prediction with multi-agent deep reinforcement learning enabled the stand structure in each plot to approach the ideal stand structure within 15–40 years, achieving dynamic optimization of stand structure. This approach provides a scientific basis and decision support for the dynamic optimization of stand structure and has broad application prospects.

Data availability statement

The dataset generated and analyzed during this study is not publicly available due to its ongoing use in current research but is available from the corresponding author, Jianming Wang (wangjianming618@163.com), upon reasonable request.

Author contributions

JZ: Conceptualization, Data curation, Formal Analysis, Investigation, Methodology, Writing – original draft. JW: Conceptualization, Data curation, Funding acquisition, Methodology, Project administration, Resources, Supervision, Validation, Visualization, Writing – review & editing. JY: Data curation, Visualization, Writing – review & editing. YC: Software, Validation, Visualization, Writing – review & editing. BW: Validation, Visualization, Writing – review & editing.

Funding

The author(s) declare financial support was received for the research and/or publication of this article. This research was funded by the National Natural Science Foundation of China (grant number 32460389), the Youth Talent Project of the Revitalization of Yunnan Province Talent Support Program (XDYC-QNRC-2022-0144), and the Yunnan Fundamental Research Projects (grant number 202501AT070417).

Acknowledgments

The authors extend heartfelt gratitude to all members of the research team for their invaluable support in data collection. Their commitment and efforts were crucial to the successful completion of this research.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declare that no Generative AI was used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Aguirre, O., Hui, G., von Gadow, K., and Jiménez, J. (2003). An analysis of spatial forest structure using neighbourhood-based variables. For. Ecol. Manage. 183, 137–145. doi: 10.1016/S0378-1127(03)00102-6

Ali, A. (2019). Forest stand structure and functioning: Current knowledge and future challenges. Ecological Indicators 98, 665–677. doi: 10.1016/j.ecolind.2018.11.017

Bohora, S. B. and Cao, Q. V. (2014). Prediction of tree diameter growth using quantile regression and mixed-effects models. For. Ecol. Manage. 319, 62–66. doi: 10.1016/j.foreco.2014.02.006

Boston, K. and Bettinger, P. (1999). An analysis of monte carlo integer programming, simulated annealing, and tabu search heuristics for solving spatial harvest scheduling problems. For. Sci. 45, 292–301. doi: 10.1093/forestscience/45.2.292

Chang, G. and Fan, W. (2024). Model for predicting individual tree crown width of natural secondary Betula platyphylla. J. Southwest For. Univ. 44, 129–135. doi: 10.11929/j.swfu.202312031

Chen, C., Zhou, L., Li, X., Zhao, Y., Yu, J., Lv, L., et al. (2023). Optimizing the spatial structure of metasequoia plantation forest based on uav-lidar and backpack-lidar. Remote Sens. 15, 4090. doi: 10.3390/rs15164090

Chi, P., Zhu, K., Li, J., Ai, W., Huang, J., and Qing, D. (2019). “Dynamic multi-objective optimization model for forest spatial structure with environmental detection mechanism,” in 2019 IEEE 21st International Conference on High Performance Computing and Communications; IEEE 17th International Conference on Smart City; IEEE 5th International Conference on Data Science and Systems (HPCC/SmartCity/DSS). 1635–1642 (IEEE).

Chunyan, Z. and Jiping, L. (2017). Spatial location and allocation of replanting trees on pure chinese fir plantation based on voronoi diagram and delaunay triangulation. J. Cent. South Univ. Forestry Technol. 37, 1–8. doi: 10.14067/j.cnki.1673-923x.2017.02.001

Ding, Y. and Zang, R. (2021). Effects of thinning on the demography and functional community structure of a secondary tropical lowland rain forest. J. Environ. Manage. 279, 111805. doi: 10.1016/j.jenvman.2020.111805

Dong, L., Bettinger, P., and Liu, Z. (2022). Optimizing neighborhood-based stand spatial structure: Four cases of boreal forests. For. Ecol. Manage. 506, 119965. doi: 10.1016/j.foreco.2021.119965

Dong, L., Wei, H., and Liu, Z. (2020). Optimizing forest spatial structure with neighborhood-based indices: Four case studies from northeast China. Forests 11, 413. doi: 10.3390/f11040413

Dongsheng, Q., Xiaofang, Z., Jianjun, L., Rui, G., and Qiaoling, D. (2020). Spatial structure optimization of natural forest based on bee colony-particle swarm algorithm. J. System Simulation 32, 371. doi: 10.16182/j.issn1004731x.joss.19-0320

Fotakis, D. G., Sidiropoulos, E., Myronidis, D., and Ioannou, K. (2012). Spatial genetic algorithm for multi-objective forest planning. For. Policy Econ 21, 12–19. doi: 10.1016/j.forpol.2012.04.002

Gangying, H., Yanbo, H., and Hai, X. (2005). Quantitative analysis of forest spatial structure. J. Northeast Forestry Univ. 26, 45-48, 60.

Gangying, H., Yanbo, H., and Zhonghua, Z. (2018). Research progress of structure-based forest management. Lin Ye Ke Xue Yan Jiu 31, 85. doi: 10.13275/j.cnki.lykxyj.2018.01.011

Gronauer, S. and Diepold, K. (2022). Multi-agent deep reinforcement learning: a survey. Artif. Intell. Rev. 55, 895–943. doi: 10.1007/s10462-021-09996-w

Gyawali, A., Sharma, R., and Bhandari, S. (2015). Individual tree basal area growth models for chir pine (Pinus roxberghii Sarg.) in western Nepal. 61, 535-543. doi: 10.17221/51/2015-JFS

Haight, R. G. and Travis, L. E. (1997). Wildlife conservation planning using stochastic optimization and importance sampling. For. Sci. 43, 129–139. doi: 10.1093/forestscience/43.1.129

Han, M., Li, L., Zheng, W., Su, J., Li, W., Gong, J., et al. (2011). Effects of different intensity of thinning on the improvement of middle-aged yunnan pine stand. J. Cent. South Univ. For. Technol. 31, 27–33.

Jian, L., Jia, X., Kun-yong, Y., Su-ping, Y., Jin-zhao, Z., and Qiu-yue, Z. (2018). Simulation of replantation of low-density ecological landscape forest with coupled stand structure. Acta Agriculturae Universitatis Jiangxiensis 40, 1125–1133. doi: 10.13836/j.jjau.2018142

Jianming, W. (2017). Study on Decision Technology of Tending Felling for Larix principis-rupprechtii Plantation Forest. Beijing Forestry University, Beijing. Ph.D. thesis.

Jianming, W., Baoguo, W., and Qiyang, L. (2017). Forest thinning subcompartment intelligent selection based on genetic algorithm. Sci. Silvae Sin. 53, 63–72.

Jiazheng, L., Xiaona, X., and Huayong, Z. (2021). Study on the growth prediction model of birch species in the mountainous area of northern hebei. J. Inner Mongolia Univ. (Natural Sci. Edition). 52, 257-263. doi: 10.13484/j.nmgdxxbzk.20210306

Jiping, L., Rui, G., Dongsheng, Q., Jianjun, L., Xiaofang, Z., Kaiwen, Z., et al. (2020). Prediction of stand spatial structure of natural secondary forest based on gm (1,1). J. Cent. South Univ. Forestry Technol. 40, 9–21. doi: 10.14067/j.cnki.1673-923x.2020.01.002

Kim, J.-H., Roh, M.-I., and Yeo, I.-C. (2025). A method for generating multiple hull forms at once using mlp (multi-layer perceptron). Ocean Eng. 324, 120659. doi: 10.1016/j.oceaneng.2025.120659

Lee, D., Repola, J., Bianchi, S., Siipilehto, J., Lehtonen, M., Salminen, H., et al. (2024). Calibration models for diameter and height growth of Norway spruce growing in uneven-aged stands in Finland. For. Ecol. Manage. 558, 121783. doi: 10.1016/j.foreco.2024.121783

Lei, J., Yuanchang, L., Shengxi, L., Kun, L., and Genqian, L. (2008). A study on diametral structure of Yunnan pine forest in the plateaus of mid-Yunnan province. For. Res. 21, 126. doi: 10.13275/j.cnki.lykxy.2008.01.025

Ling-bo, D. and Zhao-gang, L. (2011). Visualization of individual Mongolian scots pines in the plantation conditions based on characteristic parameters of morphological structures. J. Beijing For. Univ. 33, 20–27. doi: 10.13332/j.1000-1522.2011.05.017

Liu, H., Dong, X., Meng, Y., Gao, T., Mao, L., and Gao, R. (2023). A novel model to evaluate spatial structure in thinned conifer-broadleaved mixed natural forests. J. Forestry Res. 34, 1881–1898. doi: 10.1007/s11676-023-01647-w

Luo, D. (2021). Stand structure characteristics of betula alnoides natural forest in dehong prefecture, yunnan province. (Master's thesis). Chinese Academy of Forestry, Beijing.

Ma, Y., Liang, F., Zhu, M., Chen, C., Chen, C., and Lv, X. (2022). Ft-ir combined with pso-cnn algorithm for rapid screening of cervical tumors. Photodiagnosis Photodyn. Ther. 39, 103023. doi: 10.1016/j.pdpdt.2022.103023

Mengying, H., Lihu, D., and Fengri, L. (2021). Tree crown length prediction models for Larix olgensis and Fraxinus mandshurica in mixed plantations with different mixing methods. J. Nanjing Forestry University (Natural Sci. Edition) 45, 13. doi: 10.12302/j.issn.1000-2006.202005043

Na, L. (2019). The Research on Dynamic multi-objective optimization model of forest structure under CMIP5 model. (Master’s thesis). Central South University of Forestry & Technology, Changsha.

Nhat-Duc, H. and Van-Duc, T. (2023). Comparison of histogram-based gradient boosting classification machine, random forest, and deep convolutional neural network for pavement raveling severity classification. Autom. Constr. 148, 104767. doi: 10.1016/j.autcon.2023.104767

Ning, Z., Yang, Y., Wang, X., Song, Q., Guo, L., and Jamalipour, A. (2023). Multi-agent deep reinforcement learning based uav trajectory optimization for differentiated services. IEEE Trans. Mobile Computing 23, 5818–5834. doi: 10.1109/TMC.2023.3312276

Okasha, N. M. and Frangopol, D. M. (2009). Lifetime-oriented multi-objective optimization of structural maintenance considering system reliability, redundancy and life-cycle cost using ga. Struct. Saf. 31, 460–474. doi: 10.1016/j.strusafe.2009.06.005

Olsthoorn, A., Bartelink, H., Gardiner, J., Pretzsch, H., Hekhuis, H., and Franc, A. (1999). Management of mixed-species forest: silviculture and economics.

Packalen, P., Strunk, J., Maltamo, M., and Myllymäki, M. (2023). Circular or square plots in als-based forest inventories—does it matter? Forestry 96, 49–61. doi: 10.1093/forestry/cpac032

Raptis, D., Kazana, V., Kazaklis, A., and Stamatiou, C. (2018). A crown width-diameter model for natural even-aged black pine forest management. Forests 9, 610. doi: 10.3390/f9100610

Sánchez-González, M., Cañellas, I., and Montero, G. (2007). Generalized height-diameter and crown diameter prediction models for cork oak forests in Spain. For. Syst. 16, 76–88. doi: 10.5424/srf/2007161-00999

Sattler, D. F. and LeMay, V. (2011). A system of nonlinear simultaneous equations for crown length and crown radius for the forest dynamics model sortie-nd. Can. J. For. Res. 41, 1567–1576. doi: 10.1139/x11-078

Shen, R., Zhong, S., Wen, X., An, Q., Zheng, R., Li, Y., et al. (2022). Multi-agent deep reinforcement learning optimization framework for building energy system with renewable energy. Appl. Energy 312, 118724. doi: 10.1016/j.apenergy.2022.118724

Sheng, Q., Dong, L., Chen, Y., and Liu, Z. (2023). Selection of the optimal timber harvest based on optimizing stand spatial structure of broadleaf mixed forests. Forests 14, 2046. doi: 10.3390/f14102046

Shuai, X., Jianming, W., Y., J., and Yadong, G. (2024). Characteristics of stand structure of Pinus yunnanensis secondary forests on the east slope of Cangshan Mountain. J. West China Forestry Sci. 53, 47–54. doi: 10.16473/j.cnki.xblykx1972.2024.01.007

Siipilehto, J., Sarkkola, S., Nuutinen, Y., and Mehtätalo, L. (2023). Predicting height-diameter relationship in uneven-aged stands in Finland. For. Ecol. Manage. 549, 121486. doi: 10.1016/j.foreco.2023.121486

Su, J., Li, L., Zheng, W., Yang, W., Han, M., Huang, Z., et al. (2010). Effect of intermediate cutting intensity on growth of Pinus yunnanensis plantation. J. West China Forestry Sci. 39, 27–32.

Tang, M., Tang, S., Lei, X., and Li, X. (2004). Study on spatial structure optimizing model of stand selection cutting. Sci. Silvae Sin. 40, 25–31. doi: 10.1016/j.jce.2003.10.003

Von Gadow, K., Hui, G., Chen, B., and Albert, M. (2003). Beziehungen zwischen Winkelmaß und Baumabständen: Relationship between the Winkelmaß and nearest neighbor distances. Forstwissenschaftliches Centralblatt 122, 127–137. doi: 10.1046/j.1439-0337.2003.00127.x

Wang, T., Dong, L., Liu, Z., Zhang, L., and Chen, Y. (2019). Optimization of replanting space of natural secondary forest in daxing’anling mountains of northeastern China. J. Beijing For. Univ. 41, 127–136. doi: 10.13332/j.1000-1522.20190025

Waschneck, B., Reichstaller, A., Belzner, L., Altenmüller, T., Bauernhansl, T., Knapp, A., et al. (2018). Optimization of global production scheduling with deep reinforcement learning. Proc. CIRP 72, 1264–1269. doi: 10.1016/j.procir.2018.03.212

Wu, Y., Li, J., Huang, J., Qing, D., Ai, W., Chi, P., et al. (2022). Multi-objective optimization model of forest spatial structure based on dynamic multi-group pso algorithm. doi: 10.21203/rs.3.rs-1398671/v1

Xi, Z., Li-ming, J., Yu, Z., and Cong-hui, Z. (2015). Individual growth simulation for natural secondary forest of Quercus variabilis in qinling area based on fvs. J. Beijing For. Univ. 37, 19–29. doi: 10.13332/j.1000-1522.20140356

Xiaonan, W., Wenhao, S., and Lingbo, D. (2024). Age estimation model for individual trees in natural Larix gmelinii forest based on random forest model. Chin. J. Appl. Ecol. 35, 1055. doi: 10.13287/j.1001-9332.202404.023

Xuan, S., Wang, J., and Chen, Y. (2023). Reinforcement learning for stand structure optimization of Pinus yunnanensis secondary forests in southwest China. Forests 14, 2456. doi: 10.3390/f14122456

Xuan, S., Wang, J., Yin, J., Chen, Y., and Wu, B. (2024). Multi-agent reinforcement learning for stand structure collaborative optimization of Pinus yunnanensis secondary forests. Forests 15, 1143. doi: 10.3390/f15071143

Yitong, Y. (2019). Study on Forest Structure of Different Recovery Stages and Optimization Models of Natural Mixed Spruce-fir Secondary Forests on Selective Cutting. Beijing Forestry University, Beijing. Ph.D. thesis.

Zaizhi, Z. (2001). Status and perspectives on secondary forests in tropical China. J. Trop. For. Sci. 13, 639–651.

Zhang, G., Hui, G., Zhang, G., Zhao, Z., and Hu, Y. (2019). Telescope method for characterizing the spatial structure of a pine-oak mixed forest in the xiaolong mountains, China. Scandinavian J. For. Res. 34, 751–762. doi: 10.1080/02827581.2019.1680729

Zhang, G., Hui, G., Zhao, Z., Hu, Y., Wang, H., Liu, W., et al. (2018). Composition of basal area in natural forests based on the uniform angle index. Ecol. Inf. 45, 1–8. doi: 10.1016/j.ecoinf.2018.01.002

Zhang, Y., Qi, S., Zhang, L., Guo, Y., Zhang, D., Liu, S., et al. (2024). Optimizing Pinus tabuliformis forest spatial structure and function in beijing, China. Forests 15, 1963. doi: 10.3390/f15111963

Zhang, W., Wu, X., Jiang, S., Liu, Y., Weng, G., Shen, Y., et al. (2023). Technical regulation for forestation; technical report gb/t 15776-2023, state administration for market regulation.

Zhao, J., Wang, J., Yin, J., Chen, Y., and Wu, B. (2024). Optimization of the stand structure in secondary forests of Pinus yunnanensis based on deep reinforcement learning. Forests 15, 2181. doi: 10.3390/f15122181

Zhou, C., Liu, D., Chen, K., Hu, X., Lei, X., Feng, L., et al. (2022). Spatial structure dynamics and maintenance of a natural mixed forest. Forests 13, 888. doi: 10.3390/f13060888

Appendix 1

To address stand structure optimization, this study proposes a multi-agent deep reinforcement learning algorithm. The detailed pseudocode is provided below. Due to its length, the decision-making processes of Agent1 and Agent2 in the MADQN framework are presented separately.

Algorithm 1. Overall Process of MADQN for Stand Structure Optimization.

Algorithm 2. Agent 1: Selective harvesting Process.

Algorithm 3. Agent 2: Replanting Process.

Appendix 2

Furthermore, to realize dynamic optimization of stand structure, this study integrates multi-agent deep reinforcement learning for stand structure optimization with structure prediction, and presents the corresponding pseudocode.

Algorithm 4. Stand Structure Dynamic Optimization.

Keywords: multi-agent deep reinforcement learning, stand structure, multi-objective optimization, structure prediction, secondary forests

Citation: Zhao J, Wang J, Yin J, Chen Y and Wu B (2025) Dynamic optimization of stand structure in Pinus yunnanensis secondary forests based on deep reinforcement learning and structural prediction. Front. Plant Sci. 16:1610571. doi: 10.3389/fpls.2025.1610571

Received: 12 April 2025; Accepted: 10 September 2025;
Published: 15 October 2025.

Edited by:

César Marín, Santo Tomás University, Chile

Reviewed by:

Sumit Chakravarty, Uttar Banga Krishi Viswavidyalaya, India
Kai Liu, Northeast Normal University, China

Copyright © 2025 Zhao, Wang, Yin, Chen and Wu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Jianming Wang, wangjianming618@163.com
