ORIGINAL RESEARCH article
Front. Environ. Sci.
Sec. Environmental Informatics and Remote Sensing
Volume 13 - 2025 | doi: 10.3389/fenvs.2025.1599320
Comprehensive data aleatory uncertainty propagation in regression random forest using a Monte Carlo approach: a struggle with incomplete data provision using a case study on probabilistic soil moisture regionalization
Provisionally accepted
Department of Monitoring and Exploration Technologies, Helmholtz Centre for Environmental Research, Helmholtz Association of German Research Centres (HZ), Leipzig, Germany
Data uncertainty never decreases along processing chains and should always be reported alongside processing results. In this study, we attempt to propagate aleatory data uncertainty through a multiple regression analysis to generate regionalized probabilistic soil moisture maps. We employ a nonparametric solution for multiple regression by means of random forests within a Monte Carlo framework. Our input data comprise sparse soil moisture data and spatially dense auxiliary soil and topographic maps, which serve as response and predictor variables in our regression model, respectively. While the methodology is technically straightforward, challenges arise due to incomplete communication of data uncertainty by data providers. This results in knowledge gaps that must be filled by subjective assumptions rather than data-driven insights. We highlight the issues that hinder straightforward uncertainty propagation, ultimately making our final uncertainty quantification of regionalized soil maps an optimistic estimate. Additionally, we sketch how existing uncertainty classification schemes could help data providers deliver quantified uncertainties with their data, enabling users to more accurately assess and report uncertainties in their derived data products.
products provide a broader overview of surface soil moisture (SM) conditions but usually with coarse spatial resolution. In recent years, cosmic-ray neutron sensing (CRNS) has emerged as an effective method for observing root-zone SM at the catchment scale, encompassing study areas of several hundred square kilometers. CRNS measurements are sparse, integrating SM over a radius of approximately 200 meters with penetration depths reaching several decimeters (e.g., Desilets et al., 2010; Köhli et al., 2015). When mounted on mobile vehicles operating along roads or rails, CRNS can provide valuable SM data along the travel routes (e.g.,
Keywords: uncertainty propagation, probabilistic prediction, Monte Carlo, regression random forest, soil moisture, aleatory uncertainty, uncertainty quantification, cosmic-ray neutron sensing
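The general idea summarized in the abstract, propagating aleatory data uncertainty through a regression random forest within a Monte Carlo framework, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes scikit-learn's `RandomForestRegressor`, purely synthetic data, and independent zero-mean Gaussian errors with known standard deviations on both the response and the predictors. The function name `monte_carlo_rf` and the parameters `y_sigma` and `X_sigma` are hypothetical, and the chosen error model is exactly the kind of subjective assumption the abstract warns about when data providers do not communicate uncertainty.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def monte_carlo_rf(X_train, y_train, y_sigma, X_sigma, X_grid,
                   n_realizations=50, n_trees=25, seed=0):
    """Propagate assumed Gaussian data uncertainty through a random forest.

    Returns an array of shape (n_realizations, n_grid): one regionalized
    prediction per Monte Carlo realization.
    """
    rng = np.random.default_rng(seed)
    predictions = np.empty((n_realizations, X_grid.shape[0]))
    for i in range(n_realizations):
        # Draw one plausible realization of the response and the predictors
        # (ASSUMPTION: independent, zero-mean Gaussian errors).
        y_i = y_train + rng.normal(0.0, y_sigma, size=y_train.shape)
        X_i = X_train + rng.normal(0.0, X_sigma, size=X_train.shape)
        Xg_i = X_grid + rng.normal(0.0, X_sigma, size=X_grid.shape)
        # Train a fresh forest on the perturbed data and predict on the
        # perturbed predictor maps.
        model = RandomForestRegressor(
            n_estimators=n_trees,
            random_state=int(rng.integers(2**31 - 1)),
        )
        model.fit(X_i, y_i)
        predictions[i] = model.predict(Xg_i)
    return predictions


# Synthetic stand-ins: 40 sparse observations, 100 map cells, 2 predictors.
data_rng = np.random.default_rng(42)
X_train = data_rng.uniform(0.0, 1.0, size=(40, 2))
y_train = 0.1 + 0.3 * X_train[:, 0]  # hypothetical soil moisture values
X_grid = data_rng.uniform(0.0, 1.0, size=(100, 2))

preds = monte_carlo_rf(
    X_train, y_train,
    y_sigma=0.02,                    # assumed response standard deviation
    X_sigma=np.array([0.05, 0.05]),  # assumed per-predictor standard deviation
    X_grid=X_grid,
)

# Summarize the ensemble per map cell as a probabilistic prediction.
median = np.percentile(preds, 50, axis=0)
lower, upper = np.percentile(preds, [5, 95], axis=0)
```

Each map cell then carries an empirical distribution of predictions rather than a single value. The width of the 5th to 95th percentile band reflects only the uncertainty sources that were actually sampled, so any source left out of the error model makes the band narrower than it should be.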
Received: 24 Mar 2025; Accepted: 11 Jul 2025.
Copyright: © 2025 Paasche, Dega, Schrön and Dietrich. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Hendrik Paasche, Department of Monitoring and Exploration Technologies, Helmholtz Centre for Environmental Research, Helmholtz Association of German Research Centres (HZ), Leipzig, Germany
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.