DATA REPORT article
Front. For. Glob. Change
Sec. Forest Soils
Volume 8 - 2025 | doi: 10.3389/ffgc.2025.1615261
Daily soil temperatures at varying depths in Bangladesh during 2001-2022
Provisionally accepted
- 1University of Chittagong, Chittagong, Chittagong, Bangladesh
- 2Shandong University, Jinan, China
- 3University of Oxford, Oxford, England, United Kingdom
Soil temperature is a vital indicator for forest ecosystems under a changing climate. It affects plant growth directly (through its effect on physiological activity) and indirectly (through its effect on soil nutrient availability), both of which are crucial for the sustainable development of forest ecosystems (Paul et al., 2004). Moreover, soil temperature is essential for understanding carbon and nutrient cycling in forest ecosystems under warming climate scenarios (Allison et al., 2010), because both nighttime and daytime soil temperatures drive germination, blooming, and fruiting in forests (Hood. RC., 2001; Schimel et al., 2014; Hu et al., 2016; Yan et al., 2018). Incorporating soil temperature data into climate and ecosystem models may improve the accuracy of numerical weather forecasts and long-term climate projections (Dirmeyer et al., 2006). The importance of soil temperature data therefore transcends disciplinary boundaries: it addresses pressing environmental, ecological and societal challenges and is a valuable tool for advancing knowledge across various fields.
Bangladesh is a tropical country in South Asia and a transitional point for flora and fauna between the Indo-Himalayan and Indo-Chinese subregions. It has four major forest types: mixed-evergreen forests, deciduous forests, mangrove forests, and freshwater swamp forests. Forest systems in Bangladesh are extremely vulnerable to future climate change (Rahman et al., 2018), especially in coping with monsoon flooding and dry periods.
Soil temperature data in Bangladesh can offer novel insight into how changing temperatures affect flora and fauna, helping to design effective ecological conservation strategies, especially for sensitive and endangered species (Khan et al., 2015).
Moreover, accurate soil temperature data in Bangladesh can aid in optimizing water use, reducing water wastage, and preventing over-irrigation in agricultural practices (Brammer H., 2014).
Unfortunately, observed soil temperature data in Bangladesh are plagued by various limitations, including missing values, inaccuracies, and irrelevant data points. These issues reduce their accessibility and utility for researchers, policymakers, and the public. At present, the data are incomplete: among all 34 meteorological stations of the Bangladesh Meteorological Department (BMD) (Figure 1), soil temperature was measured regularly at depths of 10 cm, 30 cm and 50 cm at only 13 stations during 2001-2022. These 13 stations are Bogra, Barisal, Comilla, Dhaka, Dinajpur, Faridpur, Khulna, Mymensingh, Rajshahi, Rangamati, Rangpur, Srimongal and Tangail (marked in green in Figure 1). However, observed soil temperature data in the following period at these 13 [...]
In this data report, we used ensemble learning techniques to mine strong links between observed meteorological factors and observed soil temperature at the 13 stations and then established an optimal soil temperature forecast model. By inputting observed data from the 34 meteorological stations into this model, we can not only produce a soil temperature dataset for the remaining 21 stations without observed soil data, but also fill in the missing soil temperature data at the 13 stations, where many values are missing.
To generate soil temperatures at different depths in Bangladesh, we considered the following four ensemble models (Li et al., 2022):
Random Forest (RF) aggregates decision trees trained on bootstrapped samples, which mitigates over-fitting and handles missing values with user-defined parameters. Because RF averages the estimates of many trees, it differs fundamentally from an individual decision tree (Sun et al., 2016).
Gradient Boosted Trees (GBT) combines decision trees sequentially, with each tree correcting the errors of the previous ones. The difference between GBT and RF is that RF builds all decision trees in parallel and outputs the average of their predictions, while GBT builds decision trees sequentially and outputs the sum of their forecasts (Wu et al., 2022).
Hybrid DT-GBT employs voting to blend a decision tree and a gradient boosted tree for forecast tasks. It calculates the average of the forecasts produced by both learners.
Hybrid RF-GBT combines the strengths of both RF and GBT, boosting forecast accuracy and stability. It employs stacking and averaging techniques to enhance performance, outperforming individual models and effectively handling diverse data patterns, complex relationships, and unseen data.
From January 1, 2001, to December 31, 2022, all meteorological factors in the eight input scenarios were measured at the 34 meteorological stations of Bangladesh (Figure 1). Soil temperature, however, was measured at only 13 meteorological stations, with many missing values. Our method for generating a soil temperature dataset in Bangladesh can be divided into three steps:
Step 1. Establishment of the optimal soil temperature forecast model. We focused on the 13 meteorological stations of Bangladesh where both meteorological factors and soil temperature were measured at the same time. We deleted the time periods with missing data and divided the remaining data into two parts: the data from 1 January 2001 to 31 December 2015 formed the training dataset, and the data from 1 January 2016 to 31 December 2022 formed the testing dataset. We used four ensemble learning models (RF, GBT, hybrid DT-GBT, and hybrid RF-GBT) to establish the soil temperature forecast model.
Its output was the forecast of soil temperature at day t, and its input was one of the eight input scenarios under a time window of size k (k=1,2,3,4,5). By comparing the forecasting performance on the testing dataset, we obtained the optimal combination of ensemble learning model, input scenario and time window and used it to forecast soil temperature at different depths at the 13 meteorological stations in Bangladesh.
Step 2. Filling in missing soil temperature data at the 13 meteorological stations. In Step 1, we established the optimal soil temperature forecast model for the 13 meteorological stations in Bangladesh where soil temperature was measured only partly. During the time periods where soil temperature was not measured, all meteorological factors were measured. Therefore, we input the observed meteorological factors into the optimal soil temperature forecast model and obtained estimates of soil temperature for the periods where it was not measured.
Step 3. Generation of soil temperature data at the 21 meteorological stations. At all these stations, meteorological factors were measured, but soil temperature was not. When two meteorological stations are close together, the relationships between meteorological factors and soil temperature at the two stations are similar. Therefore, we can generate soil temperature data at the 21 stations by the following approach: For any station (e.g. [...]
Using the Pearson correlation coefficient (R) between observed and forecasted soil temperature at the 13 meteorological stations, we evaluated the forecast performance of the combinations of the four ensemble learning models, the eight input scenarios and the five time windows. Because the number of figures and tables in a data report is limited, the detailed comparison among models with different inputs and time windows is not shown here.
For the 10 cm depth, the highest average R value was 0.9419 and the lowest RMSE value was 1.597, achieved by GBT with the 8th input scenario and a five-day time window. For the 30 cm depth, the highest average R value was 0.9606 and the lowest RMSE value was 1.1959, achieved by the hybrid RF-GBT model with the 8th input scenario and a five-day time window. Similarly, for the 50 cm depth, the highest average R value (1.9005) and the lowest RMSE value (0.9397) were achieved by the same model and input conditions. These combinations of model and input scenario, which achieved the highest R value and the lowest RMSE value, were used to generate the daily soil temperature dataset in Bangladesh.
[...] Step 2 above, we filled in missing daily soil temperature data at 10 cm, 30 cm, and 50 cm in 13 [...]
We applied ensemble learning techniques to establish a daily soil temperature dataset at depths of 10 cm, 30 cm and 50 cm across all 34 meteorological stations of Bangladesh. Our dataset has high accuracy: the Pearson correlation coefficient (R) between observed and forecasted soil temperature could reach over 0.96. The use of our soil temperature dataset allows for a holistic understanding of soil temperature evolution patterns across different regions of Bangladesh. These insights will be crucial for addressing forest conservation and climate resilience challenges, making the dataset valuable for research and informed decision-making in Bangladesh.
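As an illustration only, the sketch below shows one way the time-window inputs, the averaging used by the hybrid models, and the two evaluation metrics (R and RMSE) could be computed. It is not the authors' code. The function names (make_windows, average_forecasts, pearson_r, rmse), the assumption that the window of size k ends at day t, and the sample values are all hypothetical and chosen only to make the example runnable.

```python
import math


def make_windows(features, targets, k):
    """Pair the soil temperature at day t with the inputs of the k days ending at t.

    Assumption (not stated in the report): the window covers days t-k+1 .. t.
    `features` is a list of per-day lists of meteorological factors.
    """
    X, y = [], []
    for t in range(k - 1, len(targets)):
        window = []
        for day in features[t - k + 1 : t + 1]:
            window.extend(day)  # flatten the k days into one input vector
        X.append(window)
        y.append(targets[t])
    return X, y


def average_forecasts(pred_a, pred_b):
    """Average two learners' forecasts, as described for the hybrid models."""
    return [(a + b) / 2 for a, b in zip(pred_a, pred_b)]


def pearson_r(obs, pred):
    """Pearson correlation coefficient between observed and forecasted values."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_p = sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    sd_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs))
    sd_p = math.sqrt(sum((p - mean_p) ** 2 for p in pred))
    return cov / (sd_o * sd_p)


def rmse(obs, pred):
    """Root mean square error between observed and forecasted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


# Made-up daily values: [air temperature, relative humidity] and soil temperature.
features = [[30.1, 80.0], [31.0, 78.0], [29.5, 85.0], [28.9, 90.0]]
targets = [27.0, 27.4, 27.1, 26.8]
X, y = make_windows(features, targets, k=2)
```

In practice the rows of X would be passed to the chosen ensemble learners, and pearson_r and rmse would be computed on the 2016-2022 testing period for each combination of model, input scenario and window size k.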
Keywords: forest soil, soil temperature, varying depth, climate change, Bangladesh
Received: 21 Apr 2025; Accepted: 16 May 2025.
Copyright: © 2025 Das, Zhang, Crabbe and ALAM. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Zhihua Zhang, Shandong University, Jinan, China
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.