Your new experience awaits. Try the new design now and help us make it even better

ORIGINAL RESEARCH article

Front. Remote Sens., 26 January 2026

Sec. Microwave Remote Sensing

Volume 6 - 2025 | https://doi.org/10.3389/frsen.2025.1728399

On demand machine learning-driven surface freeze-thaw retrieval across Canadian agricultural regions using Sentinel-1 SAR data

  • 1Department of Environmental Sciences, University of Québec at Trois-Rivières, Trois-Rivieres, QC, Canada
  • 2Center for Northern Studies (CEN), Québec City, QC, Canada
  • 3Research Centre for Watershed-Aquatic Ecosystem Interactions (RIVE), University of Québec at Trois-Rivières, Trois-Rivières, QC, Canada

This study explores the prediction of freeze-thaw (FT) states across agricultural fields in four Canadian provinces-Alberta, Manitoba, Saskatchewan, and Québec-using Random Forest classification and regression models. Soil temperature data at a 5 cm depth were gathered from 174 agricultural weather stations from 2016 to 2023. Sentinel-1 SAR VH radar backscatter indicators were processed using Google Earth Engine (GEE). Two modeling approaches were evaluated: a classification model trained on in situ data, where soil states were rigidly classified as either frozen or thawed, and a regression model trained against in situ soil freezing probabilities. Additionally, other site-specific ancillary variables such as latitude, altitude, crop type, and soil type were tested as potential predictors. The regression model using the Exponential Freeze-Thaw Algorithm (EFTA) derived from VH radar backscatter (VHEFTA) demonstrated strong discrimination between frozen and thawed states, and emerged as the most influential factor, accounting for over 90% of the model’s predictive ability. Models using VHEFTA alone achieved up to 81.4% accuracy for classifying FT state, with only a slight improvement to 82.1% when combined with other predictors. Spatial and temporal validation showed stable accuracy (0.79–0.83) and F1-scores (0.75–0.88) across regions and years. Evaluation of model sensitivity to seasonal and temperature variability revealed that although uncertainties were not fully eliminated during transitional periods for both binary and probability-based FT models, binary-based models consistently showed lower error rates and stronger performance. The final FT model was implemented within an interactive web-based tool that generates on-demand FT maps for user-supplied regions of interest across Canadian agricultural and open-land areas.

1 Introduction

More than half of the world’s land experiences seasonal freezing and thawing (Kim et al., 2010), which significantly impacts hydrological processes (Cade-Menun et al., 2013; Gray et al., 1985), agricultural activities (Osei et al., 2024), and biogeochemical cycles by altering surface energy balances (Schuur et al., 2015; Black et al., 2000), vegetation growth (Beer et al., 2007), and soil moisture (Wei et al., 2019). Soil freeze-thaw (FT) cycles occur at high latitudes (Moradi et al., 2023; Schuur et al., 2022) and high elevations (Peng et al., 2016; Ekici et al., 2015), and play a crucial role in regulating the water cycle and carbon exchange, particularly in seasonally frozen ground and permafrost regions. From an agricultural perspective, FT cycles play a crucial role in crop growth, soil moisture availability, and the vertical redistribution of organic matter within farmland soil (Sun et al., 2021; Hou et al., 2020). From a hydrological perspective, regional and local hydrological processes have been affected by the reduction and eventual disappearance of the freezing period in seasonally frozen regions, as well as by the accelerated melting of ice and snow (Ireson et al., 2013; Niu and Yang, 2006). These changes further emphasize the importance of understanding FT cycles in the context of climate change. Accurate characterization of soil FT transitions and freezing duration in cold regions can help to understand how precipitations are partitioned between surface runoff and infiltration in soils, thereby helping to better predict catchment-scale hydrology.

Station-based measurements with data loggers offer high-resolution temporal data, making them valuable for monitoring near-surface FT states. However, their sparse distribution limits their spatial coverage, rendering them inadequate for assessing long-term FT cycle impacts across larger landscapes (Gao et al., 2018; Wu et al., 2020; Shao and Zhang, 2020). In contrast, remote sensing provides an effective alternative for spatiotemporal monitoring of soil surface FT status, enabling coverage across broader areas with consistent observations. Space-borne active and passive remote sensing observations operating at microwave frequencies have been used for the continuous monitoring of the soil FT state over large, extensive areas, regardless of cloud cover and seasonal daylight changes (Karthikeyan et al., 2017).

Active microwave techniques, particularly those utilizing synthetic aperture radar (SAR), provide the higher spatial resolution needed to accurately capture FT variability in heterogeneous landscapes (Cohen et al., 2024). As part of Copernicus, Sentinel-1’s C-band SAR instruments are particularly well suited to monitor soil FT states at local to regional scales due to their high spatial resolution (5 × 20 m in IW mode) and frequent revisit times (approximately every 6 days), making them ideal for complex environments such as agro-forested mosaic landscapes.

Several studies have leveraged SAR data to detect seasonal near-surface FT conditions. For example, Baghdadi et al. (2018) demonstrated that both VV and VH polarizations from Sentinel-1 are sensitive to frozen soils, with VH showing greater responsiveness to near-surface freezing (<5 cm). Building on this foundation, recent research in southern Québec introduced the Exponential Freeze-Thaw Algorithm (EFTA) to enhance FT detection in agro-forested landscapes, particularly when applied to VH backscatter signals (Taghipourjavi et al., 2024b).

Although C-band SAR is highly responsive to near-surface FT conditions through associated changes in soil dielectric constant, snow cover can also modify radar backscatter in ways that complicate the interpretation of FT signals (Cohen et al., 2024; McNairn and Brisco, 2004; Cohen et al., 2024). Wet snow, characterized by a markedly higher dielectric constant due to its liquid-water content, can substantially attenuate or mute the radar return, often producing backscatter values that resemble those of frozen soil surfaces (Lund et al., 2022; Benninga et al., 2019). For example, in agricultural fields in North Dakota, the onset of wet snow during spring melt led to a clear decrease in VH backscatter, making it challenging to distinguish wet snow from frozen ground solely based on backscatter signals (Manickam and Barros, 2020). In open agricultural fields of southern Quebec, C-band SAR backscatter was found to decrease by up to 1 dB on snow-covered days when the soil was frozen, with the reduction becoming more pronounced as snow depth and snow water equivalent increased, although the total effect for shallow, dry snowpacks (up to about 20 cm snow water equivalent) remained below the system’s measurement uncertainty of approximately 1 dB (Bernier and Fortin, 1998). These effects, driven by differences in dielectric behavior and snowpack structure, illustrate that both wet-snow conditions and variations in dry-snow depth can potentially impact C-band backscatter independently of soil FT processes.

To further improve the precision of FT state detection, integrating machine learning approaches offers a promising avenue for leveraging large datasets and uncovering complex spatial and temporal patterns at regional scales (Mugunthan et al., 2023). Despite limitations in the quantity and quality of ground truth data used for training, machine learning (ML) approaches can provide robust predictions of FT status with minimal bias, enabling more accurate and reliable FT monitoring. Among these methods, the Random Forest (RF) algorithm, a widely used shallow learning technique, has proven to be highly effective in various studies. For instance, Zhong et al. (2022) showed that the RF algorithm significantly improved the accuracy of FT onset detection in Alaska. Similarly, Donahue et al. (2023) developed a continuous daily classification record of surface (0–5 cm depth) soil FT dynamics across all Northern Hemisphere land areas using RF models informed by multifrequency brightness temperature (TB) observations from passive microwave sensors, such as AMSR2 and SMAP.

Given the reliance of ML models on observations for training, it is also important to examine the broader context of how soil FT states can be derived from soil temperature data. Although various studies have employed rigid classification methods to model surface soil freezing, categorizing soil distinctly as either frozen or thawed, these methods often fail to capture the inherent uncertainties and spatial variability in FT states, which are crucial for accurate predictions (Donahue et al., 2023; McColl et al., 2016). In contrast, incorporating soil freezing probability into a probabilistic FT modeling approach would provide a more nuanced understanding of the transitions between frozen and thawed states, as well as the associated uncertainties and variability in soil freezing conditions (Walker et al., 2022). While previous study using Sentinel-1 C-band data at two agro-forested sites in southern Québec have demonstrated the potential of SAR observations for local FT monitoring (Taghipourjavi et al., 2024b), these efforts remain spatially and temporally limited and do not capture the broader heterogeneity of Canadian agricultural landscapes. In practice, FT variability across croplands, pastures, and open lands is strongly shaped by differences in soil types, crop cover, and snow depth and density, as well as regional climate gradients. However, none of these variables have been systematically evaluated within a unified SAR-based framework. Consequently, there is a need for remote sensing-based soil FT retrieval methods that generalize beyond site-specific conditions and work reliably across diverse environments. This study addresses that need by scaling FT detection to the Canadian agricultural domain through the integration of Sentinel-1 SAR data with large-sample machine-learning analyses, enabling a consistent and operational framework for monitoring FT detections across Canada’s major agricultural regions.

Accordingly, this research aims to evaluate the large-scale capabilities of SAR-based indicators, particularly the EFTA derived from VH backscatter (VHEFTA) for detecting and predicting FT states in agricultural fields across Canada. We begin by investigating the potential drivers of radar backscatter, including soil FT state, soil and crop type, as well as the possibly confounding effects of snow depth and snow wetness, in order to identify the dominant drivers of radar backscatter. Then, two modeling approaches are implemented to predict soil FT: a RF classification model trained on binary in situ FT labels (frozen vs. thawed), and a RF regression model trained on continuous soil freezing probabilities, both derived from an extensive network of 174 agricultural weather stations across four Canadian provinces (2016–2023). The study also assesses the contribution of different radar polarization configurations (VH, VV, and VH/VV) to FT detection performance and applies EFTA to each to enhance signal interpretability during transitional periods. In addition, various combinations of site-specific predictors including soil type, crop type, altitude, and latitude are integrated to evaluate their added value in improving FT classification and probabilistic prediction of the near-surface FT state. The research further explores model sensitivity to seasonal and temperature variability to better understand classification and prediction accuracy during transitional periods, when radar signal fluctuations are more pronounced. We then implement the best FT model within an interactive web-based tool that allows users to generate on-demand FT maps for any regions of interest across Canadian agricultural and open-land areas.

2 Materials and methods

2.1 Study area and data

The study area spans across four Canadian provinces: Alberta, Manitoba, Saskatchewan, and Québec, each characterized by diverse climatic, soil, and agricultural conditions. The diversity of these regions provides a comprehensive landscape for examining FT states. This study utilized publicly available soil temperature data from 174 agricultural weather stations, ensuring broad spatial coverage. These provinces cover various climatic zones (See Supplementary Figure S1), crop types, and soil conditions, allowing for a broad analysis of soil freezing and thawing events. The spatial distribution of stations is illustrated in Figure 1, showing that they are primarily located in agricultural regions across each province in Canada. Stations are grouped by region, with distinct colors and shapes indicating differences in observational periods. Map views (Figures 1A–E) provide detailed representations of station locations and IDs for Manitoba, Alberta, Saskatchewan, St-Marthe (Québec), and St-Maurice (Québec), respectively. As illustrated by the spatial distribution of observational stations in Figure 1, the coverage is uneven across provinces, reflecting the density of agricultural monitoring networks rather than a uniform sampling design. Stations are concentrated in intensively cultivated regions of the Prairies and southern Québec, with fewer sites in sparsely farmed areas. Consequently, the dataset predominantly represents croplands, pastures/grasslands, and open agricultural lands, consistent with the agricultural focus of this study.

Figure 1
Map illustrating various regions in Canada with different land cover types. Panels A to E highlight specific areas with colored dots and land cover classifications, including cropland, forests, wetlands, and urban areas. The central map shows a broader geographic context, marking locations in Manitoba, Alberta, Saskatchewan, and Quebec. Each area features numbered stations or dots to represent specific observation sites. A legend at the bottom explains the color codes for different land cover types. The map is detailed with geographic coordinates and scale bars for distance reference.

Figure 1. Geographic distribution of agricultural weather stations with soil temperature measurements across five Canadian provinces, overlaid on Canada’s 2020 land cover map (source: https://open.canada.ca/data/en/dataset/ee1580ab-a23d-4f86-a09b-79763677eb47, accessed on 15 October 2025). Station groups are color-coded by province or region to indicate differences in observational periods (see legend, lower left). Subplots show the spatial distribution and assigned station IDs for each region. (A) Manitoba; (B) Alberta; (C) Saskatchewan; (D) St-Marthe (Québec); and (E) St-Maurice (Québec).

2.2 Soil temperature data

Table 1 presents a summary of weather stations with available soil temperature data at a depth of 5 cm across the four studied Canadian provinces. In Manitoba, the station network is well distributed across its southern agricultural fields. Soil temperature data is collected by the Manitoba Agriculture Weather Program (MAWP), which maintains a comprehensive database of historical meteorological information (https://www.gov.mb.ca/agriculture/weather/manitoba-ag-weather.html, accessed 05 December 2025). A total of 113 weather stations actively recorded hourly soil temperature from October 2016 to June 2023 (Figure 1A). 24 weather stations were available in Alberta (Figure 1B), managed by the Alberta Climate Information Service under the Ministry of Agriculture and Irrigation. These stations recorded 5 cm-depht hourly soil temperature from October 2017 to April 2023. The stations are relatively well distributed across the province, albeit more sparsely, and with more stations in southern farmlands.

Table 1
www.frontiersin.org

Table 1. Overview of weather stations for soil temperature measurements across four Canadian provinces.

Saskatchewan has a network of 23 weather stations (Figure 1C), primarily concentrated in the southern part of the province, specifically within the mixed grassland landscapes of the Kenaston Network, an experimental farm located approximately 80–100 km south of Saskatoon. This network spans the Brightwater Creek basin, a representative prairie agricultural region within Canada’s Prairies/Boreal Plains Ecozones (Tetlock et al., 2019). Established in 2011 by Agriculture and Agri-Food Canada (AAFC), the network complements an existing network managed by Environment Canada (EC) and the University of Guelph (Tetlock et al., 2019). The EC-Guelph network supports various research efforts, including hydrological studies, land surface modeling, and satellite data validation. The AAFC sites, located on pasture lands, complement EC-Guelph sites situated on annual cropland. The data for these stations was recorded at 30-min intervals from October 2016 to May 2020.

In Québec, while more than 60 stations provide multi-year soil temperature data at 10 cm depth, these were excluded because Sentinel-1 SAR C-band primarily reflects conditions in the upper few centimeters of soil. Instead, soil temperature data at 5 cm depth was obtained from in situ measurements collected during recent fieldwork (Taghipourjavi et al., 2024a). These measurements span two consecutive years (2020–21 and 2021–22) and were gathered from agricultural plots instrumented in St-Marthe (Figure 1D) and St-Maurice (Figure 1E) sites.

2.3 Snowpack variables

As snowpack conditions were not available at most of the stations, snow depth (SD) and snowpack temperature (Ts) were acquired from the ERA5-Land reanalysis (Muñoz-Sabater et al., 2021) from October 2016 to April 2023 (See Supplementary Figure S2). A snow-wetness indicator was derived by identifying periods when SD > 0 and Ts > 0 °C, corresponding to conditions under which liquid water is expected within the snowpack. With its ∼10 km spatial resolution, ERA5-Land does not capture field-scale variability at individual stations, but its physically based snow energy-balance scheme and consistent temporal coverage offer a useful representation of regional snow conditions across the four provinces included in this study. The mean monthly snow depths across all sites (See Supplementary Figure S2) range from approximately 1–8 cm in October, 6–17 cm in November, and 13–29 cm in December. Peak snow depths are observed in January and February, reaching 20–32 cm in the prairies and up to 43 cm in Québec. The snowpack gradually decreases through March and typically melts by April, with mean depths falling below 10 cm across all provinces. These values indicate that agricultural landscapes in the study area generally experience shallow to moderatly-thick snowpacks, conditions under which C-band backscatter is expected to remain predominantly sensitive to near-surface soil FT changes (Moradi et al., 2024; Fayad et al., 2020). Since radar backscatter is sensitive to both surface roughness and water content, which vary spatially and temporally across different soil and crop types, these variables were considered as potential additional predictors of soil freeze-thaw. Crop type data were gathered from Agriculture and Agri-Food Canada (AAFC) annual crop inventory data from 2016 to 2022 (https://open.canada.ca/data/en/dataset/ba2645d5-4458-414d-b196-6303ac06c1c9, accessed 05 December 2025), while soil type data were primarily sourced from the Detailed Soil Survey (DSS) compilations from the National Soil Database (NSDB) (2016–2023) (https://sis.agr.gc.ca/cansis/nsdb/dss/v3/index.html, accessed 05 December 2025). However, the DSS data are based on general soil information for large polygons, where each polygon may contain multiple soil components. These components are not explicitly mapped, and only the proportion of the area covered by each soil type is provided. As a result, the DSS dataset lacks the precision needed for site-specific mapping of soil characteristics at weather stations, particularly at exact UTM points. To overcome this limitation, soil texture data from the World Soil Information Service (WoSIS) (https://soilgrids.org/, accessed 05 December 2025) were incorporated. Soil texture data from WoSIS, based on the exact locations of the installed stations, were used to assign relevant soil properties to each weather station.

The resulting distribution of soil types across all sites (Supplementary Figure S3A) shows that clay loam is the dominant soil type in both Alberta (46.0%) and Manitoba (51.6%), followed closely by loam in both provinces (45.1% in Alberta and 17.8% in Manitoba). In Québec, clay is the predominant soil type (42.9%), followed by silty clay (35.7%). Regarding crop types, pasture and grassland are the primary crops in Alberta (31.3%) and Manitoba (22.0%), respectively, while corn is the most prevalent crop in Québec (35.7%), and spring wheat is dominant in Saskatchewan (32.2%) (Supplementary Figure S3B).

In addition to soil and crop types, which directly influence radar backscatter through their effects on soil moisture and surface roughness, altitude and latitude were included as potential site predictor variables. Altitude affects local climatic conditions such as temperature, precipitation, and soil moisture, which in turn influence near-surface soil freezing heterogeneity (Johnston et al., 2021), while latitude governs broader climatic gradients that shape the extent, depth, and persistence of seasonal soil freezing (Chang et al., 2022).

2.4 Sentinel-1 SAR imagery

We utilized Sentinel-1 satellite data, which are part of the Copernicus Sentinel constellation and are equipped with C-band SAR sensors operating at a frequency of 5.405 GHz. These sensors operate in Interferometric Wide swath (IW) mode, offering a single-looked backscatter resolution ranging from 5 m to 20 m at incidence angles between 29° and 46° over flat terrain, and covering a swath width of 250 km. In Google Earth Engine (GEE), this data is resampled to a 10–m grid to simplify analysis and integration with other datasets, providing both cross-polarization (VH) and co-polarization (VV) backscatter, which offers valuable insights into surface characteristics (Gorelick et al., 2017).

Preprocessing in GEE involved several steps to ensure accurate and consistent backscatter measurements across the study area. These included the application of orbit files for precise geolocation, removal of thermal and border noise, and radiometric calibration. To account for terrain-induced distortions, terrain correction was performed using high-resolution DEMs. Speckle noise, which degrades SAR image quality, was mitigated using the Refined Lee filter (Yu et al., 2018). In addition, local incidence angle (LIA) normalization was applied following Schaufler’s method, which adjusts σ° values to a reference angle of 40° based on a linear regression with LIA (Schaufler et al., 2018; Taghipourjavi et al., 2024b). To maximize data coverage for FT modeling, both ascending and descending Sentinel-1 passes were used, although ascending acquisitions were more frequent across the study area.

2.5 Exponential freeze-thaw algorithm (EFTA)

The exponential freeze-thaw algorithm (EFTA, Equation 1) was applied using the VH and VV backscatter indicators, given their demonstrated sensitivity, particularly VH polarization, to changes in surface permittivity relevant to FT prediction (Baghdadi et al., 2018; Taghipourjavi et al., 2024b).

EFTA=eK1+σTσtΔθt(1)

Where the expression eK1+σTσt denotes an exponential decay function, ensuring that the outcome remains within the range of 0–1. The binary parameter K represents the expected thaw (K=1) and frozen (K=0) periods, respectively defined based on the most negative and positive differences between radar signals at time t and at the preceding time (t1). Specifically, K is set to 0 between the largest drop in backscatter before February and the largest rise in backscatter after February, reflecting anticipated frozen periods, and is set to 1 otherwise. The term Δθt is defined as Δθt=σTσt, where σt refers to the backscattering coefficient acquired at time t, and σT is a reference backscattering coefficient for thawed conditions. The algorithm implements an exponential decay function that mitigates the effect of backscatter fluctuations during the anticipated thaw periods in the shoulder seasons, while amplifying them during the expected frozen periods.

The physical basis of the EFTA approach derives from the strong dielectric contrast between thawed and frozen soil. Liquid water has a high dielectric constant (∼80), whereas ice exhibits much lower permittivity (∼3–4), producing a pronounced reduction in surface and volume scattering as soils freeze (Mavrovic et al., 2021; Wu et al., 2022). At C-band, VH polarization is particularly responsive to these changes because its cross-polarized signal captures scattering from small-scale roughness elements, soil–vegetation interfaces, and dielectric heterogeneity, all of which diminish as the water content decreases and the soil freezes (Jagdhuber et al., 2014; Baghdadi et al., 2018). Conversely, VV polarization is dominated by coherent surface scattering and is therefore less sensitive to these structural and dielectric transitions (Baghdadi et al., 2018). As a result, VH backscatter exhibits a clearer and more consistent decay between thawed and frozen states, providing a more stable basis for the exponential decay formulation used in EFTA.

2.6 Soil freezing indicators

The 5 cm-depth soil temperature data from weather stations (Table 1) were converted to FT status using two different approaches. The first is a rigid classification, where negative soil temperatures (T ≤ 0 °C) are classified as frozen and thawed otherwise (T > 0 °C). This binary classification ignores the uncertainties introduced by temperature sensor accuracy and the spatial heterogeneity in soil temperatures within the satellite footprint, which can confound the detection of freezing conditions. For example, temperature data loggers typically have accuracies of ±0.5 °C, and small-scale spatial variability (e.g., within a typical 10 × 10 m S1 pixel) can contribute additional uncertainty of up to ±0.5 °C, as observed in St-Marthe and St-Maurice in the Québec Province (Taghipourjavi et al., 2024b). These uncertainties are particularly significant during zero-curtain periods, when soil temperatures hover around 0 °C, typically under deep snowpacks or during seasonal transitions (Outcalt et al., 1990). To address this, a probabilistic approach was applied to predict the probability of soil freezing, P (T ≤ 0 °C), at each measuring station. The standard normal distribution (z), a continuous probability distribution (Wackerly, 2008), was used to parameterize the uncertainty in soil temperature measurements (Equation 2):

PT0=1Pz<Tμσ(2)

where PT0 represents the cumulative probability that the measured soil temperature (T) is less than or equal to zero (the frozen state). The mean μ is set to zero, corresponding to the freezing point, and σ was set to 0.25 °C, assuming a typical T sensor uncertainty of ±0.5 °C corresponding to a two-sigma (±2σ) range. The calculated z score represents the standardized value of the measured soil temperature relative to the freezing point, and P is its cumulative probability distribution. The classified binary FT states and the calculated soil freezing probabilities were extracted for the times matching the Sentinel-1 overpass, to be used for further analysis and model development.

2.7 Random Forest modelling

Random Forest (RF) is a robust ensemble learning method known for its resistance to overfitting and its ability to handle predictors with varying value ranges without requiring data normalization (Breiman, 2001). In addition, the interactions between radar backscatter, soil thermal state, crop types, soil texture, and geographical variables are inherently nonlinear, and RF models are able to capture such complexities without assuming linearity or requiring explicit functional forms. Its bootstrap aggregation approach improves model stability and reduces sensitivity to noise, which is essential given the diverse observational conditions across stations (Fox et al., 2017). RF is also advantageous because it naturally accommodates both categorical and continuous predictors, allowing seamless integration of variables such as soil and crop type (categorical), binary soil-freezing state (categorical), SAR-derived indicators (continuous), and probabilistic soil-freezing values (continuous) within a single modeling framework. Before developing the FT prediction models, we first explored the key drivers of radar backscatter using an exploratory Random Forest regression with VHEFTA as the response variable and soil freezing probability, snow depth, wet-snow conditions, soil type, and crop type as predictors. These predictors represent all know potential physical drivers of radar backscatter, and this exploratory model aimed at identifying which of these contribute most to backscatter variations. Our hypothesis is that soil FT status should be the key driver and guide further model development to predict soil FT state. Three model configurations were evaluated to quantify the relative explanatory contribution of these predictors: (i) a full model including all predictors, (ii) a reduced model including only soil freezing probability and snow depth, and (iii) a baseline model using soil freezing probability alone. This analysis focused on the snow-related predictors to determine whether variations in snow depth or wet-snow conditions introduce measurable effects on VHEFTA relative to the dominant influence of soil FT conditions. The outcome of this step provides a physically grounded basis for interpreting the FT classification and regression models described below.

An RF classification model was then trained on observed binary FT data, where the target variable was already classified as either frozen (1) or thawed (0) based on soil temperature measurements. The modelling framework examined a variety of indicator combinations, including raw Sentinel-1 backscatter signals (VV, VH), their ratio (VH/VV), and corresponding EFTA-derived metrics (VVEFTA, VHEFTA, and VHEFTA/VVEFTA). These indicators were used alone or in combination with static predictors such as latitude, altitude, crop type, and soil type to evaluate their possible contribution to FT classification performance. Since this model was trained directly on discrete FT states, it performed a standard binary classification task optimized to recognize patterns in categorical FT data. The classification performance was evaluated using Overall Accuracy (OA), F1-score, and the area under the Receiver Operating Characteristic curve (ROC-AUC). OA measures the proportion of correctly classified samples relative to the total number of samples. ROC-AUC (Equation 3) quantifies the area under the ROC curve, which illustrates the trade-off between the true positive rate (Equation 4) and the false positive rate (Equation 5) across classification thresholds.

AUC=01TPRFPRdFPR(3)
TruePositiveRateTPR:TPTP+FN(4)
FalsePositiveRateFPR:FPFP+TN(5)

Where TP (true positives) refers to correctly predicted positive cases, TN (true negatives) to correctly predicted negative cases, FP (false positives) to negative cases incorrectly predicted as positive, and FN (false negatives) to positive cases incorrectly predicted as negative. ROC-AUC was selected for its robustness to class imbalance, as demonstrated by Richardson et al. (2024), who showed that when the distribution of predicted classification scores remains unchanged, ROC-AUC remains stable despite variations in class proportions.

A second model based on RF regression was trained on continuous soil freezing probability data to predict FT probabilities, providing a more nuanced representation of FT conditions, particularly during transitional periods when soil temperatures fluctuate around 0 °C. The regression model was trained using the same predictors as the classification models. Unlike classification models that output discrete frozen/thawed states, the regression model predicted continuous probability values, representing the probability of a site being frozen. The regression model was evaluated using R2 (coefficient of determination) to measure how well predicted freezing probabilities aligned with observed freezing probabilities and RMSE (Root Mean Squared Error) to quantify the magnitude of prediction errors.

To ensure comparability between the regression and classification models, the continuous outputs of the probability-based regression model were binarized using a fixed threshold of 0.5, predicted probabilities ≥0.5 were classified as frozen (1), and those <0.5 as thawed (0). The chosen threshold represents a balanced decision boundary between the two frozen and thawed states, corresponding to equal classification probabilities. This binarization allowed us to evaluate the regression model’s ability to classify FT states by directly comparing its thresholded predictions to observed binary FT labels using standard classification metrics. Through this approach, we assessed whether the regression model, originally trained on continuous FT probabilities, could serve as an effective alternative to models explicitly trained on binary FT data.

To optimize model performance and limit overfitting, key RF hyperparameters were tuned using a GridSearchCV approach from Scikit-learn with 5-fold cross-validation. The cv = 5 configuration provided a balanced trade-off between training data size and validation robustness while maintaining a manageable computational cost. Stratified splitting was applied within the cross-validation procedure to preserve the proportion of frozen and thawed samples in each fold. During model training, the Out-of-Bag (OOB) approach was used to provide an internal and unbiased estimate of model accuracy, as approximately one-third of the samples are excluded from the bootstrap sample used to train each tree. In addition, an independent stratified split of the dataset was performed, with 70% of the samples used for training and 30% reserved for testing, to evaluate overall model performance.

2.8 Feature importance

Feature importance analysis was conducted to assess each variable’s contribution to the model’s performance by calculating the average reduction in impurity across all trees in the ensemble (Saarela and Jauhiainen, 2021; Aldrich, 2020). Partial Dependence Plots (PDPs) were used to further understand how individual features influence the model predictions. PDPs illustrate how varying a single feature affects the model’s average prediction, revealing the marginal relationship between each feature and the target variable (Goldstein et al., 2015). As such, PDPs provide insights into the key variables influencing the prediction of FT conditions.

2.9 Spatial and temporal model validation

A spatial K-fold cross-validation approach was employed to evaluate the model’s ability to generalize across different sites and accurately classify FT states. Given the uneven distribution of stations, a leave-one-station-out cross-validation approach was employed to evaluate the model’s spatial transferability. In each fold, all the data from a single station was excluded from calibration and used as the test set, while the model was trained on the remaining stations.

Temporal cross-validation was conducted to evaluate the model’s temporal transferability. The data was split chronologically by year, and the model was trained on all years except one, which was used for validation. This leave-one-year-out approach ensures that the model is tested on unseen data from different sub-time periods within the period of data availability (October to June from 2016 to 2023). All hyperparameters were kept constant throughout the process, with no retuning between folds. Data processing, statistical analysis, and modeling were conducted using Python programming in Google Colab.

3 Results

3.1 Diagnostic assessment of controls on VHEFTA variability

The full exploratory Random Forest regression model explained 47% (OOB R2 ≈ 0.47; RMSE ≈2.43 dB) of the variability in VHEFTA (Table 2). The feature-importance analysis shows that soil freezing probability is by far the dominant predictor, contributing approximately half of the total importance, with snow depth as second most important predictor (Figure 2A). The remaining predictors, i.e., wet-snow indicator, soil type and crop type account for only a negligible share (Figure 2A). When these predictors are removed from the full model, the prediction performance remains almost unchanged (Table 2: reduced model: OOB R2 ≈ 0.41; RMSE ≈2.56) and soil freezing probability explains approximately 75% of the total importance, with snow depth contributing the remainder (Figure 2B). When snow depth is removed and soil freezing is used as sole predictor, the performance is unchanged (Table 2: baseline model: OOB R2 ≈ 0.41; RMSE ≈2.56). This indicates that adding snow depth does not improve the predictive capability beyond what is already captured by soil freezing probability, and that nearly all explainable variation in VHEFTA originates from soil FT changes.

Table 2
www.frontiersin.org

Table 2. Performance of the exploratory Random Forest regression models implemented to identify the physical variables contributing to variations in VHEFTA. The full, reduced, and baseline models were compared using Out-of-Bag (OOB) R2 and RMSE.

Figure 2
Bar charts labeled A and B show feature importance. Panel A includes five features: WetSnow, Soil type, Crop type, Snow depth, and Soil freezing probability, with Soil freezing probability being the most important. Panel B shows two features: Snow depth and Soil freezing probability, with the latter being the most significant.

Figure 2. Random Forest feature importance for VHEFTA variability. (A) Full model including soil freezing probability, snow depth, wet-snow indicator, soil type, and crop type. (B) Reduced model including only soil freezing probability and snow depth.

This preliminary diagnostic analysis thus reveals that VH-based FT indicators are controlled primarily by dielectric effects arising from soil FT processes, with snow depth and wetness contributing only minimally to the overall signal variability. Consequently, the FT prediction models presented in the subsequent sections were developed under this assumption, and snow variables were excluded as potential predictors. Another first insight from this preliminary analysis is the relatively low R2 coefficient obtained, which may arise because soil FT probabilities that are larger than zero (thawed) and smaller than one (frozen) may have a limited correlation with VHEFTA variations. This point is developed further in the next section.

3.2 Spatial and temporal distribution of soil freezing probability

The near surface (5 cm -depth) soil freezing probability derived at each agricultural weather station shows distinct spatial and temporal FT patterns (Figure 3). Soil freezing probabilities in Manitoba, Saskatchewan, and Alberta frequently exceeded 80%, lasting between 4 and 6 continuous months each year from mid-October to late April. The transition to frozen conditions typically began in early November, with thawing starting in early April. Notably, Manitoba exhibited longer spring transition periods compared to the other provinces, often extending up to a month, with soil freezing probabilities ranging from 40 to 60 percent during that time in several years. The two study sites in Québec show reduced freezing durations compared to the other sites. These sites also exhibited a higher frequency of FT transitions during the frozen period, indicating both temporal variability at these locations and spatial heterogeneity when compared to the more consistent FT patterns observed across sites in Manitoba, Saskatchewan, and Alberta.

Figure 3
Multi-panel figure showing where the weather stations are located in the study regions. The panels show stations grouped by province/site: (a) Manitoba, (b) Alberta, (c) Saskatchewan, and (d) the St-Marthe and St-Maurice sites in Québec. Each panel displays station locations as points within the mapped area, showing how stations are spread across each region. Labels and map elements (such as boundaries and a scale) provide geographic context.

Figure 3. Temporal and spatial visualization of freezing probability across four Canadian provinces using soil temperature at 5 cm depth measured at agricultural weather stations (November 2014 to June 2023). The Y-axis indicates the distribution of stations in (a) Manitoba; (b) Alberta; (c) Saskatchewan; (d) St-Marthe and St-Maurice sites (Québec), respectively.

3.3 RF model performance in binary FT classification

To assess the potential of different radar polarizations in capturing FT transitions, classification models were initially evaluated using VH, VV, and VH/VV backscatter as individual predictors. The baseline RF model based on VH backscatter alone achieved the highest performance, with an overall accuracy of 66.64% and an F1-score of 66.80% (Table 3). The VV-based model performed moderately worse (60.18% accuracy, 62.19% F1-score), indicating a weaker sensitivity to FT state detection, possibly due to its lower responsiveness to vegetation structure. Interestingly, the VH/VV ratio, despite integrating both volume and surface scattering behaviors, resulted in the lowest performance (56.66% accuracy, 60.29% F1-score). When combined with ancillary variables such as soil type, crop type, latitude, and altitude, each of these models exhibited a modest yet consistent improvement, with increases in overall accuracy and F1-score of up to 3 percentage points across all three input configurations: VV, VH, and the VH/VV ratio.

Table 3
www.frontiersin.org

Table 3. Summary of RF classification performance using observed binary FT data with optimized hyperparameters across combinations of predictors and indicators. Bold values correspond to results using the VH backscatter indicator, italic values represent VV backscatter results, and regular font refers to the VH/VV ratio. Indicator groups include original VH and VV backscatters as well as the VH/VV ratio (left), and EFTA-derived metrics (right).

To improve classification beyond raw backscatter, we applied the EFTA-derived indicators from each polarization configuration. The VHEFTA -based model showed a substantial performance boost, achieving 81.36% accuracy and an F1-score of 80.32% (Table 3). Models based on VVEFTA and VHEFTA/VVEFTA also outperformed their corresponding raw backscatter inputs; however, their performance remained notably lower than that of VHEFTA, with a maximum accuracy of 78.71%. The integration of ancillary predictors with the EFTA-based models led to only marginal gains-typically less than 1 percentage point, suggesting that soil type, crop type, latitude, and altitude contribute limited additional discriminatory power once the EFTA indicator is applied.

The features importance analysis for the model combining VHEFTA with all the ancillary predictors shows that the VHEFTA index contributes more than 90% of the total feature importance in predicting soil freezing states, indicating its dominant role in the model (Figure 4). Meanwhile, the contributions from the other predictors collectively account for less than 10% of the model’s importance, in accordance with the OOB performance metrics (Table 3). Among these, altitude and latitude show slightly higher relative importance, suggesting some influence of geographical factors on soil freezing predictive capability, albeit modest.

Figure 4
Bar chart showing the relative importance of five predictors: Soil type, Crop type, Latitude, Altitude, and VH_EFTA. VH_EFTA has the highest importance, significantly surpassing the others, which have minimal values.

Figure 4. Feature Importance Plot illustrating the relative importance of each predictor in predicting soil freezing state using a Random Forest classifier model.

The PDPs in Figure 5 show the marginal effect of each predictor on the predicted freezing probabilities from the FT classification model. VHEFTA exhibits a strong and non-linear influence on soil freezing probability, showing its dominant influence on FT prediction (Figure 5A). The average partial dependence line shows that higher values of VHEFTA correlate with a greater probability of frozen states, with a threshold around 2 on the VHEFTA scale separating frozen from thawed conditions (P = 0.5). In contrast, the other predictors exhibit more stable, less important associations with FT state (Figures 5B–E).

Figure 5
Five partial dependence plots labeled A to E, illustrating relationships between variables and response. Each plot shows multiple blue line patterns, with an orange dashed line indicating trends. Axes include different scales and categories such as numerical values and labeled categories like crop types and soil conditions.

Figure 5. Partial dependence analysis of predictors in the FT classification model using VHEFTA and other predictors. (A) VHEFTA; (B) latitude; (C) altitude; (D) crop type; and (E) soil type. The orange dashed line indicates the average partial dependence for each predictor, while the blue lines represent the individual simulations for each predictor value. The spread of the blue lines reflects the variability or uncertainty in the model’s prediction for that predictor. Larger spreads suggest higher variability in the model’s response to changes in the predictor, indicating that the predictor has a more complex or less predictable effect on soil freezing probability.

3.4 Radar-based FT detection and VHEFTA

Although soil and crop types did not significantly enhance model performance, the analysis investigates how VH backscatter and the VHEFTA index respond to surface heterogeneity-particularly variations in soil and crop types-to assess their ability to distinguish between frozen and thawed conditions. However, there is considerable overlap between the two states, particularly in most soil and crop types (Figures 6A,C), which reduces the effectiveness of VH backscatter alone in discriminating FT states. In contrast, the VHEFTA shows a clearer distinction between frozen and thawed conditions compared to VH backscatter signals (Figures 6B,D).

Figure 6
Box plots across four panels (A, B, C, D) show data on VH backscatter and the VH_EFTA indicator under frozen and thawed conditions. Panels (A) and (B) compare different soil types: (A) VH backscatter by soil type and (B) VH_EFTA by soil type. Panels (C) and (D) compare different crop types: (C) VH backscatter by crop type and (D) VH_EFTA by crop type. Outlier cases are provided in Supplementary Figures S2 and S3.

Figure 6. Comparison of radar signal variations and VHEFTA algorithm changes across different crop and soil types in frozen and thawed conditions. (A) VH radar backscatter signal variations across different soil types; (B) VHEFTA indicator variations across different soil types; (C) radar backscatter signal variations across different crop types; (D) VHEFTA indicator variations across different crop types. Note: Outliers cases are illustrated in supplementary material (Supplementary Figures S2, S3).

Outliers in the boxplots were statistically identified using the standard interquartile range (IQR) rule, where values falling below 1.5 times the IQR below the first quartile (Q1) or exceeding 1.5 times the IQR above the third quartile (Q3) were flagged as outliers. A detailed analysis of the occurrence and distribution of outliers in FT detection using VH and VHEFTA predictors is provided in the Supplementary Material (See Supplementary Figures S4, S5).

3.5 Spatial and temporal validation

Spatial cross-validation resulted in an averaged accuracy of 0.82 and F1-score of 0.80, indicating overall consistent model performance and spatial transferability in detecting FT states (Figure 7A). Most stations exhibited a balanced distribution between frozen and thawed states over the October-June period (Figure 7B), however some stations departed from this pattern. Some stations, such as 106 (Manitoba), 136 (Alberta), 141 (Saskatchewan), and 162 (Quebec), show a clear dominance of thawed conditions. These stations have cross-validated accuracy values ranging 0.79–0.87 and F1-scores from 0.77 to 0.87, showing that class imbalance at a given site does not affect model prediction performance.

Figure 7
Graph consisting of two panels. Panel A depicts the accuracy and F1-score across various stations by provinces, using red circles and blue triangles, respectively. Scores fluctuate between 0.700 and 0.900. Panel B illustrates observations as bar charts indicating thawed (red) and frozen (blue) states. Stations are marked on the x-axis, with switches in climatic states highlighted by changing background colors.

Figure 7. Spatial cross-validation results and freeze-thaw data distribution by station. Stations from Manitoba are shaded in light blue, Alberta in light green, Saskatchewan in light pink, and Québec in light orange. (A) Station spatial cross-validation scores (accuracy and F1-score); (B) frequency of freeze and thaw occurrences at each station. The blue and red bars in the background represent the number of frozen (≥0.5) and thawed (<0.5) occurrences, respectively, for each year, revealing the temporal variability of freeze-thaw events.

The mean temporal cross-validation model accuracy and F1-score are 0.81 and 0.80, respectively, indicating overall consistent performance across all years (Figure 8). This stable performance across years highlights the model’s ability to transfer to years not used in model training, despite annual class imbalance having only a slight impact on cross-validated accuracy.

Figure 8
Bar and line chart depicting accuracy and F1-score from 2016 to 2023. Red dots represent accuracy, and blue triangles represent F1-score. Background bars indicate thawed (red) and frozen (blue) observations. Both scores generally stay between 0.79 and 0.83.

Figure 8. Temporal model cross-validation scores and annual FT frequency distribution (October-June period). Model performance is presented using accuracy and F1-score metrics (red and blue dots), while the secondary Y-axis shows the number of thawed (blue) and frozen (red) occurrences for each validation fold (left-out year).

3.6 FT binary and probability-based models to soil freezing probability

The optimized RF regression model using all potential predictors predicted the observed soil freezing probability with an R2 of 0.50 and an RMSE of 0.32 (Figure 9A). Using VHEFTA as sole predictor in the RF regression model yielded comparable performance, with an R2 of 0.46 and an RMSE of 0.31 (Figure 9B). The scatter plots of predicted/observed probabilities against VHEFTA show considerable scatter in observed probabilities compared to model predictions. Observed freezing probabilities are strongly clustered near 0 and 1, corresponding to strong thawed and freezing conditions, while intermediate probability values indicate transitional FT states. Hence the RF regression model has limited ability to accurately predict transitional FT phases but appears better to capture the thawed/frozen extremes.

Figure 9
Scatter plots comparing observed and predicted soil freezing probabilities against a variable V_HEFTA. Panel A shows R-squared value of 0.50 and RMSE of 0.32, while panel B shows R-squared value of 0.46 and RMSE of 0.31. Blue dots represent observed probabilities, and red dots represent predicted probabilities.

Figure 9. Prediction of soil freezing probability. (A) Optimized RF regression model with VHEFTA, Latitude, Station’s Elevation, Crop Type, and Soil Type as predictor variables; (B) Optimized RF regression model with only VHEFTA.

To compare the regression and classification models, the continuous probabilities from the RF regression model were converted into binary FT states. Table 4 presents the metrics for the binarized RF regression model, including both the full-predictor model and the VHEFTA-only model. The model using all predictors achieved an accuracy of 81.66% and an F1-score of 80.26%, demonstrating strong classification capability in distinguishing frozen and thawed conditions. The models using only VHEFTA performed similarly, with an accuracy of 80.79% and an F1-score of 79.36%. When comparing these results to those of the binary classification model (Table 3), the binarized RF regression model performs similarly, but with slightly lower accuracy.

Table 4
www.frontiersin.org

Table 4. Comparison of FT binary and FT probability-based models for soil freezing probability prediction.

3.7 Evaluation of model sensitivity to seasonal and temperature variability

To evaluate seasonal variability in model performance for both regression (soil freezing probability-based) and binary-based FT detection, we computed daily classification error and regression RMSE for each day of the year (DOY) from October 1 (DOY 1) to June 30 (DOY 273). Figure 10 provides a detailed characterization of model errors during transitional FT periods, showing how classification error and regression RMSE sharply increase during fall and spring, as well as around the 0 °C threshold. This targeted analysis enables explicit evaluation of transitional-period performance and supports interpretation of the underlying physical drivers of misclassification. As shown in Figure 10A, the probabilistic-based model exhibits pronounced error peaks during Fall (DOY 40–68) and early Spring (DOY 160–205), where errors are assumed to be significant when exceeding a threshold of 0.4. For the binary-based model, the error peaks are more temporally confined, occurring during Fall (DOY 45–68) and early Spring (DOY 175–205). Outside the transitional periods, particularly in October, January, February, and June, model performance remains relatively stable, with both error metrics consistently falling below 0.25. Binary-based FT detection errors remain slightly lower than probabilistic-based RMSEs across most periods, particularly during non-transitional phases. The distribution of observations throughout the season remains consistent (Figure 10A, gray bars), indicating reliable Sentinel-1 coverage and sufficient data availability across the monitoring period. For further analysis of temperature-related variability, Figure 10B shows classification error (red line) for the binary FT model and regression RMSE (blue line) for the probability model, plotted across soil temperature intervals ( °C). Model errors from both approaches peak around the 0 °C threshold, where the soil transitions between frozen and thawed states, a range that also corresponds to the most substantial presence of observational data. Outside this transition zone, error levels gradually decrease as soil temperatures move farther from the freezing point. Notably, across the full temperature range, errors from the binary-based model remain consistently lower than from the probabilistic-based model, indicating stronger performance. At the extreme soil temperatures, around −22 °C and +32 °C, unexpectedly high model errors are observed, corresponding to low observational data availability.

Figure 10
Chart A shows classification error and regression RMSE over seasonal days, with error peaks around days 50 and 180. Observations are highest around day 150. Chart B displays these metrics against soil temperature. Both metrics peak at extreme temperatures, with observations highest at 0°C.

Figure 10. Seasonal and temperature-dependent variability in model error for classification (binary-based) and regression (probabilistic-based) FT detection models. (A) Daily classification error (red) and regression RMSE (blue) from October 1 (Day 1) to June 30 (Day 273), with observation counts (gray bars) representing the number of Sentinel-1 overpasses aggregated across all stations and years. (B) Error trends as a function of soil temperature. The vertical dashed line denotes the 0 °C threshold used to distinguish frozen and thawed states.

3.8 Web-based on-demand freeze-thaw mapping tool

To generalize the FT retrieval beyond station-based analyses, we developed a web-based FT mapping tool that enables large-scale and user-driven application of the method (Figure 11). Through the Streamlit frontend, users can define the region of interest and select a temporal window restricted to the cold-season period (October 1–June 30), when soil FT transitions occur, choose the spatial resolution (10 m, 30 m, or 100 m), and optionally apply land-cover filters. These inputs trigger a cloud-based processing workflow where the Google Earth Engine (GEE) backend automatically handles data preparation, including Sentinel-1 collection filtering, application of a Refined Lee speckle filter, terrain normalization, image mosaicking, and optional land-cover clipping. When the agricultural-land option is selected, the workflow restricts computation to cropland and related classes including tropical/sub-tropical grassland, temperate/sub-polar grassland, cropland, and barren land using the 2020 Land Cover of North America at 30 m dataset (Commission for Environmental Cooperation (CEC), (2021)) available in GEE. All processing is performed through the GEE API, providing seamless access to the full Sentinel-1 archive and ensuring consistent, scalable FT retrieval.

Figure 11
Flowchart illustrates a processing pipeline integrating a frontend user interface and backend system. Users select dates, resolution, and land cover filters using the Streamlit interface, triggering data processing. Sentinel-1 SAR data undergoes filtering, speckle reduction, normalization, mosaicking, and optional land cover clipping. NALCMS 2020 data aids in EFTA components computation. Pixel-wise FT classification ensues, linking to an RF Freeze-Thaw model. The output includes classified image series, statistical outputs, TIFF maps, and image count overview.

Figure 11. Schematic workflow of the web-based FT Detection Tool. Color conventions: light blue for user-interface components; grey for primary Sentinel-1 SAR data input; cyan for the NALCMS 2020 land-cover layer used for optional clipping; orange for processing modules within the GEE backend; and yellow green for consolidated FT output products. The processing pipeline established within a Python environment in Google Colab.

The workflow then computes the EFTA components and applies the optimized Random Forest FT classifier pixel-wise to all available acquisitions within the user-selected period. The platform returns a chronological series of classified FT maps within the region of interest, summary statistics, and downloadable GeoTIFF layers, enabling rapid visualization and export of results. This cloud-enabled architecture generalizes the VH-based FT detection to any user-defined area, providing an operational framework suitable for regional monitoring, agricultural assessment, and research applications. The tool and supporting Python scripts are openly available through the project’s GitHub repository (https://github.com/Shahab-J/Freeze-Thaw-Detection, accessed 05 December 2025), and the fully deployed web application is accessible at https://freeze-thaw-detection-kmpqcuusaqtf5ypu5h3vyg.streamlit.app/ (accessed 05 December 2025).

4 Discussion

4.1 Implications of the diagnostic assessment of VHEFTA controls

The diagnostic assessment indicates that variations in VHEFTA are primarily governed by near-surface soil freezing conditions, while snow-related variables exert a more limited influence. Although wet snow has been reported to affect C-band radar backscatter through enhanced attenuation and volume scattering (Brangers et al., 2024; Gill et al., 2015), it did not emerge as a dominant control in the present analysis. Although not explicitly tested, the exponential decay structure of EFTA, combined with the seasonal constraint imposed by K, may partially attenuate short-term VH perturbations associated with wet snow during thaw and shoulder periods, thereby reducing the sensitivity of VHEFTA to transient snow wetness effects.

Snow depth appeared as a contributing factor in the feature-importance analysis; however, its inclusion did not enhance predictive performance relative to models based solely on soil freezing probability. This behavior is consistent with the well-established transparency of dry snow at C-band frequencies, which generally limits its direct influence on radar backscatter (Pivot, 2012; Bernier and Fortin, 1998). The apparent importance of snow depth is therefore more likely to reflect its phenological co-occurrence with frozen soil conditions rather than an independent physical control on the radar signal. This interpretation is further supported by a strong positive Spearman rank correlation (See Supplementary Figure S6) between soil freezing probability and snow depth (ρ = 0.79), indicating pronounced seasonal co-occurrence rather than an independent explanatory role of snow depth on VHEFTA variability.

4.2 RF classification performance for observed binary FT

The performance differences observed across VH, VV, and VH/VV-based models can be explained by their respective sensitivity to surface and vegetation characteristics. The model using VH backscatter achieved the highest accuracy, attributed to VH’s strong sensitivity to vegetation structure and surface roughness due to its stronger volume-scattering sensitivity. This makes it more responsive to surface changes associated with FT transitions in agricultural lands. In contrast, VV polarization, which primarily reflects surface scattering and responds better to smooth, bare soil conditions, showed weaker performance. This limits its ability to detect FT-related variations where vegetation and roughness are present. The VH/VV ratio, despite integrating information from both polarizations, exhibited the lowest performance. Its strong vegetation sensitivity and weaker soil-moisture response (Vreugdenhil et al., 2018) educe the dielectric contrast critical for FT detection.

From a physical standpoint, the enhanced performance of VHEFTA over raw VH backscatter can be attributed to the pronounced contrast in dielectric properties between frozen and thawed soils. Liquid water has a high dielectric constant of approximately 80, as reported by Skolunov (1997), while ice exhibits a much lower value near 3.2, according to Matzler and Wegmuller (1987). This strong dielectric contrast significantly affects radar signal interaction, particularly at C-band frequencies during FT transitions. During frozen conditions, the reduction in dielectric constant diminishes surface and volume scattering, resulting in lower radar returns. VH polarization, being more sensitive to volume scattering and surface structure, responds markedly to these changes. The application of EFTA further enhances FT discrimination by reducing signal ambiguity during transitional periods. VHEFTA improves FT state detection by attenuating thaw-related noise through exponential smoothing, particularly during transitional periods when VH signals are affected by surface variability. The logistic exponential decay function applied to the VH signal captures variations during the fully frozen winter period while minimizing ambiguous fluctuations during spring and fall transitions, which are often affected by surface heterogeneity such as roughness, thin water films, and patchy snow cover. By suppressing these transient effects and emphasizing the frozen-phase signal, VHEFTA improves contrast between FT states and enhances detection reliability.

The strong performance of VHEFTA underscores its effectiveness in capturing FT transitions and its importance as the primary indicator for FT classification. While VVEFTA and the VHEFTA/VVEFTA ratio also show improvements compared to the original backscatter indicators, their relative insensitivity to vegetation-induced signal variability limits their effectiveness compared to VHEFTA. These findings are consistent with the results of Fayad et al. (2020), which showed that VH-based detection of frozen agricultural plots is more reliable than VV, due to a greater radar signal decay under freezing conditions, making VH less ambiguous for freeze detection. They also align with Taghipourjavi et al. (2024b), who demonstrated the effective use of Sentinel-1 VH polarization for identifying soil FT transitions in agro-forested areas of southern Québec, Canada. Further comparison of model performance using ROC-AUC scores also supports these findings, with an ROC-AUC score of 88% when using all predictors and 86% when using only VHEFTA (See Supplementary Figure S7).

The feature importance analysis further supported the significance of VHEFTA, which accounted for over 90% of the total contribution in the RF model (Figure 4). Although previous studies highlighted the influence of crop and soil types on radar backscatter during FT events and suggested that adjusting thresholds based on land cover could improve accuracy (Baghdadi et al., 2018), our results indicated a limited contribution from these variables.

The additional predictors, including altitude, latitude, crop type, and soil type, collectively explained less than 10% of the overall model performance. Partial dependence plots further support these findings, revealing a strong, non-linear positive relationship between VHEFTA and soil freezing probability. Although the altitude predictor exhibited a relatively low variable importance (<5%) in our Random Forest model, the decline in frozen state probability beyond 600 m is a result that can be attributed to the presence of only a few observational stations at higher altitudes (See Supplementary Figure S8). While model accuracy varied slightly among crop and soil types, no substantial variation could be attributed to either variable (See Supplementary Figures S9, S10). Accordingly, the RF model exhibits consistent adaptability across multiple crop and soil types, indicating robust generalization in diverse agricultural areas. This is primarily due to the computation of a site-specific thawed reference backscatter value σT for the derivation of VHEFTA (Equation 1), which accounts for spatial and interannual variability in soil or crop roughness effects on reference backscatter values before soil freezing occurs.

4.3 FT discrimination by VH and VHEFTA indicators

Although soil and crop types showed limited contribution to model improvement in FT discrimination, exploring this aspect helps clarify how the VH and VHEFTA predictors perform in distinguishing frozen and thawed conditions under high soil and crop heterogeneity in Canadian agricultural fields. According to Figure 6, VHEFTA consistently provides better separation between FT classes than the raw VH predictor across different soil (Figures 6A,B) and crop types (Figures 6C,D). VH backscatter signals display substantial overlap between frozen and thawed states, especially under variable soil textures and vegetation structures. In contrast, VHEFTA reduces this overlap by attenuating signal noise associated with soil moisture variability and surface roughness. This improvement could be driven by EFTA’s capacity to better capture signal variations during fully frozen winter periods while reducing the impact of unstable fluctuations during transitional phases, compared to the raw VH predictor.

Several outliers in both VH backscatter and VHEFTA predictors can be observed in Figure 6. These appear most frequently during FT transition periods, as illustrated in Supplementary Figure S4. During these transitions, the surface often experiences mixed-phase states, temporary surface water accumulation, or patchy snow cover, which can cause local variations in dielectric properties and backscatter response (Bergstedt and Bartsch, 2017). Although both predictors encounter challenges under such conditions, the VHEFTA predictor still demonstrates a clearer separation between frozen and thawed states. It does, however, exhibit more thawed-classified outliers compared to raw VH, suggesting that increased sensitivity may come with the cost of detecting additional transient variations. Nevertheless, VHEFTA’s enhanced detection capability contributes to the model’s overall higher classification performance. These additional thaw-classified outliers, representing only 7.6% of total data, do not significantly degrade the model’s ability to distinguish between frozen and thawed states.

4.4 Spatial and temporal validation

The FT spatial and temporal variability observed throughout this study can arise from variability in surface roughness, driven by soil conditions and crop types and phenology, as well as fluctuating climate and soil moisture, particularly during spring thaw. Uneven surface conditions resulting from diverse land practices and tillage methods can influence radar backscatter signals and could bias model training and transferability to unseen conditions. However, the spatial cross-validation analysis showed effective spatial model transferability (Figure 7A), even on stations with less observations or imbalanced distribution between thawed and frozen occurrences (Figure 7B). This result aligns with the earlier finding that VHEFTA effectively captures local surface conditions, minimizing reliance on site-specific ancillary predictors to represent the spatial heterogeneity of FT changes. Although the spatial distribution of agricultural stations is uneven across provinces, the model generalized well across the range of agricultural environments represented in the dataset, indicating that the FT classification framework is transferable within croplands, pastures/grasslands, and open agricultural lands despite regional differences in station coverage.

Despite the observed FT temporal variations, the model also transferred well to unseen years, i.e., on years removed from model training (Figure 8). This stable model performance indicates that it effectively adapts to the changing conditions over time, such as variations in surface roughness caused by seasonal soil moisture, diverse crop vegetation, and land management practices like tillage.

4.5 Evaluating binary and probability-based FT prediction models

Our results demonstrate that the RF regression model, once binarized, performs comparably to the direct classification model, suggesting its outputs can still be reliably used to predict frozen and thawed conditions. The regression model, though, does not predict FT probabilities as reliably as it classifies binary states. As illustrated in Figure 9B, while the probabilistic model for FT states is overall effective, many discrepancies are evident, particularly when high VHEFTA values coincide with low freezing probabilities (observations in the lower-right quadrant of Figure 9B). This discrepancy likely arises because even minor changes in soil moisture can significantly alter the soil’s complex permittivity, a phenomenon that becomes especially pronounced during spring thawing. These effects are especially evident during transitional periods, where the EFTA algorithm inherently introduces discrepancies in FT state detections. Specifically, the identification of the most negative backscatter difference prior to February (interpreted as the fall freeze onset) and the most positive difference after February (interpreted as the spring thaw onset) defines the expected frozen and thawed periods within the EFTA framework. During shoulder periods, a small number of thawed or frozen events per study year occurring just before or after these key thresholds result in EFTA that do not fully reflect the actual FT state. For instance, in fall, a few freeze events occurring before the identified most negative backscatter difference are assigned lower EFTA values, despite corresponding to frozen conditions, explaining the presence of freeze-related outliers in the upper-left quadrant of Figure 9B. Similarly, in spring, some thawed events occurring shortly before the detected maximum positive difference receive higher EFTA values, although they represent thawed soil with low freezing probability. These limitations help explain the occurrence of thaw-related outliers in the lower-right quadrant of Figure 9B. Understanding these surface dynamics is essential for interpreting backscatter extremes that guide EFTA’s identification of FT phases. Since surface changes during transitional periods alter radar backscatter by modifying the physical and dielectric properties of the near surface, factors such as spring snowmelt and fall harvest become especially relevant to the interpretation of EFTA values.

For additional insight, the seasonal distribution of model errors highlights the variability in soil surface conditions during FT transitions. In fall, elevated error levels in both models can be attributed to the onset of surface freezing and shifts in dielectric properties caused by declining soil temperatures and early snow accumulation. These changes affect radar backscatter, leading to increased errors in binary-based FT detection and higher RMSEs in FT probability predictions. Similarly, the spring transition exhibits the most prolonged and elevated error values, highlighting the difficulty in capturing the onset of thaw. During this period, near-surface soil moisture increases rapidly due to snowmelt, introducing significant dielectric variability and variable radar backscatter. This variability contributes to greater uncertainty in FT state prediction, particularly for probability-based models. Although both models follow similar seasonal trends, binary-based model errors remain consistently lower outside the transitional periods. This indicates that binary classification provides more stable performance under consistent surface conditions, whereas FT probabilistic predictions respond more variably depending on the complexity of transitional states.

During winter, when soils remain fully frozen for an extended period, as well as in early fall and late spring when soils are fully thawed, backscatter remains relatively stable, leading to reduced binary-based model errors and lower RMSEs in FT probability predictions. However, FT detection uncertainty remains highest during transitional periods, when changing surface roughness, increasing soil moisture, and post-harvest surface alterations such as crop residue and tillage practices especially under snowmelt conditions, introduce radar backscatter variability and complicate FT state interpretation. As shown in Figure 10B, both classification errors and regression RMSEs peak near the 0 °C threshold, indicating reduced model reliability during late fall and early spring. During these periods, partial soil freezing, rising soil moisture from snowmelt, and evolving surface conditions intensify signal changes. Soil moisture, in particular, directly influences the dielectric constant and thus radar backscatter, making it a critical driver of FT detection error. Seasonal wetting and drying strongly impact the EFTA algorithm, which identifies the most extreme backscatter differences to detect frozen and thawed periods. Accurate interpretation of FT transitions, especially during shoulder seasons, would benefit from concurrent soil moisture observations. However, such data were not available across the 174 agricultural weather stations used in this study, limiting direct validation of moisture-related influences on EFTA responses. Future research should integrate co-located soil moisture observations and conduct statistical analyses to examine their relationship with radar backscatters during transitional periods, to assess their role in refining FT classification and prediction based on both binary and probabilistic approaches.

Outside the critical 0 °C transition range, model performance improves considerably. In colder conditions (below −5 °C), soils remain consistently frozen with minimal surface variability, resulting in lower model errors. Similarly, temperatures above +5 °C correspond to fully thawed states with more stable dielectric properties, further enhancing model performance. Throughout all temperature ranges, classification models exhibit lower error values than regression models, indicating greater robustness to short-term radar backscatter variations. However, at the extreme ends of the temperature spectrum (below −20 °C and above +30 °C), two noticeable spikes in error appear, which can be primarily attributed to the limited number of observations available under these conditions.

To support broader applicability, this study leveraged in situ weather stations distributed across diverse climatic regimes, as illustrated in the Köppen climate classification map (See Supplementary Figure S1). These stations primarily span hot-summer humid continental and cold semi-arid zones, with limited representation from subarctic regions. Together, these zones encompass the dominant agro-climatic settings across Canadian agricultural lands, reinforcing the relevance of the proposed approach at a national scale. Given the climatic representativeness of these regions, the proposed FT detection method shows strong potential for transferability to other FT-affected agricultural regions with similar climatic and land cover characteristics. However, its application in areas marked by distinct cryospheric and topographic conditions such as mountainous environments with steep orographic gradients, variable snowpack dynamics, and heterogeneous surface structures-would require further algorithmic optimizations and regional calibration to ensure reliable FT state detection.

4.6 Translating findings into an operational FT retrieval framework

The strong and consistent performance of EFTA derived from VH backscatter, explaining more than 90% of the predictive importance and outperforming all other radar- and ancillary-based indicators provided a clear basis for adopting it as the core input for operational FT mapping. Its demonstrated robustness across diverse soil textures, crop types, and climatic conditions supports a streamlined modelling framework that avoids dependence on weakly influential ancillary variables. Accordingly, the web-based FT detection platform was designed around VHEFTA as the main radar-derived indicator, enabling efficient pixel-wise FT classification across Canadian croplands, pastures, grasslands, and barren lands. This architecture ensures full scalability within the GEE environment and supports user-driven mapping across any region of interest without site-specific recalibration.

Although the web-based FT mapping tool developed in this study provides an operational platform for on-demand retrieval of FT states from historical Sentinel-1 data, it is not designed to predict future FT conditions. All outputs reflect observed SAR acquisitions within the selected period, rather than forecasted FT states. Future research could therefore focus on developing predictive FT modelling frameworks, where machine-learning or physics-informed approaches integrate meteorological forecasts, soil-temperature projections, or climate-scenario data to anticipate upcoming FT transitions. The current modelling and validation rely mainly on stations located in croplands, pastures/grasslands, and barren lands. A useful extension would be to test, adapt and calibrate the FT detection framework for forested regions, where canopy structure and stronger volume scattering may require adjustments to the retrieval approach.

5 Conclusion

This study employed RF classification and regression models to predict soil FT states over Canadian agricultural fields, using both binary FT and probabilistic FT indicators derived from soil temperature data. The analysis confirmed the dominance of VHEFTA as the primary predictor, reinforcing its ability to capture radar backscatter variations linked to FT transitions. VHEFTA demonstrated the highest reliability in detecting FT transitions due to its sensitivity to surface roughness and vegetation structure. In contrast, VV- and VH/VV-based indicators showed reduced effectiveness, reflecting their limited interaction with heterogeneous surface conditions. Additional predictors such as soil type, crop type, latitude, and altitude provided only marginal improvements, indicating that VHEFTA effectively captures the dominant site-specific influences on radar backscatter.

By incorporating snow depth and a wet-snow indicator into the diagnostic analysis, we confirmed that snow processes exert only a minor influence on VHEFTA compared to soil freezing probability. Although snow-related variables contributed measurably to backscatter variability, their limited predictive value indicates that VHEFTA remains primarily governed by near-surface thermal conditions across the agricultural landscapes studied.

While the probability-based regression model, once binarized, achieved classification accuracy similar to that of the RF classification model, transitional FT states remained more complex to capture. These difficulties were most evident during shoulder periods, where fluctuations in radar signals-driven by combinations of soil moisture variation, snow presence, and evolving surface roughness-introduced greater uncertainty in prediction. Model performance was reduced near the 0 °C threshold, likely due to rapid changes in surface roughness and near-surface wetness; however, the extent of their influence on FT detection accuracy remains to be quantified. FT detection uncertainty was largely confined to transitional periods marked by pronounced soil surface changes, and classification models demonstrated greater robustness compared to regression approaches under such conditions. Improving FT algorithms to better handle signal instability around the freezing point remains essential for advancing prediction accuracy.

Despite these limitations, the overall consistency in performance across diverse agricultural settings suggests that the approach is broadly transferable to other cold-region farmlands. Given the climatic representativeness of the study areas, the proposed method demonstrates strong potential for application in other agricultural regions affected by FT events. However, additional calibration and regional adjustment may be necessary for regions with complex snowpack variability, strong altitude gradients, or heterogeneous soil surface conditions.

Data availability statement

The datasets supporting the conclusions of this article are publicly available in Borealis at DOI: https://doi.org/10.5683/SP3/ZJ03SX. Further inquiries can be directed to the corresponding author.

Author contributions

ST: Conceptualization, Data curation, Formal Analysis, Funding acquisition, Investigation, Methodology, Resources, Software, Validation, Visualization, Writing – original draft. CK: Conceptualization, Data curation, Funding acquisition, Investigation, Methodology, Project administration, Resources, Supervision, Validation, Writing – review and editing. AR: Conceptualization, Funding acquisition, Investigation, Methodology, Project administration, Resources, Supervision, Writing – review and editing.

Funding

The author(s) declared that financial support was received for this work and/or its publication. This study was financially supported by Fonds de Recherche du Québec Nature et Technologies (FRQNT) under the Merit scholarship program for foreign students (PBEEE) program for doctoral research scholarships (File number: 275403; DOI: https://doi.org/10.69777/275403), Canadian Space Agency FAST (19FAQCTB19), and the FRQNT Team Research Project grant 2022-PR-299776 (C. Kinnard).

Acknowledgements

We would like to express our sincere gratitude to Madam Alison Sass at Manitoba Agriculture and Resource Development for kindly providing us with Manitoba’s soil temperature data from all available stations over the full available period. As per the Data Use Agreement, the data is acknowledged as coming from the Manitoba Agriculture Weather Program as part of Manitoba Agriculture. Her support and assistance were invaluable in enabling this research, and we deeply appreciate her contribution to the data collection process.

Conflict of interest

The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declared that generative AI was used in the creation of this manuscript. The authors declare that generative artificial intelligence (OpenAI’s GPT-4) was used to assist in the grammatical correction of text and in generating alternative phrasing for certain sentences within this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/frsen.2025.1728399/full#supplementary-material

References

Aldrich, C. (2020). Process variable importance analysis by use of random forests in a shapley regression framework. Minerals 10, 420. doi:10.3390/min10050420

CrossRef Full Text | Google Scholar

Baghdadi, N., Bazzi, H., El Hajj, M., and Zribi, M. (2018). Detection of frozen soil using Sentinel-1 SAR data. Remote Sens. 10, 1182. doi:10.3390/rs10081182

CrossRef Full Text | Google Scholar

Beer, C., Lucht, W., Gerten, D., Thonicke, K., and Schmullius, C. (2007). Effects of soil freezing and thawing on vegetation carbon density in siberia: a modeling analysis with the Lund-Potsdam-Jena Dynamic Global Vegetation Model (LPJ-DGVM). Glob. Biogeochem. Cycles 21. doi:10.1029/2006GB002760

CrossRef Full Text | Google Scholar

Benninga, H.-J. F., Van Der Velde, R., and Su, Z. (2019). Impacts of radiometric uncertainty and weather-related surface conditions on soil moisture retrievals with Sentinel-1. Remote Sens. 11, 2025. doi:10.3390/rs11172025

CrossRef Full Text | Google Scholar

Bergstedt, H., and Bartsch, A. (2017). Surface state across scales; temporal and spatial patterns in land surface freeze/thaw dynamics. Geosci 7, 65. doi:10.3390/geosciences7030065

CrossRef Full Text | Google Scholar

Bernier, M., and Fortin, J.-P. (1998). The potential of times series of C-band SAR data to monitor dry and shallow snow cover. IEEE Trans. Geosci. Remote Sens. 36, 226–243. doi:10.1109/36.655332

CrossRef Full Text | Google Scholar

Black, T., Chen, W., Barr, A., Arain, M., Chen, Z., Nesic, Z., et al. (2000). Increased carbon sequestration by a boreal deciduous forest in years with a warm spring. Geophys. Res. Lett. 27, 1271–1274. doi:10.1029/1999GL011234

CrossRef Full Text | Google Scholar

Brangers, I., Marshall, H.-P., De Lannoy, G., Dunmire, D., Mätzler, C., and Lievens, H. (2024). Tower-based C-band radar measurements of an alpine snowpack. TC 18, 3177–3193. doi:10.5194/tc-18-3177-2024

CrossRef Full Text | Google Scholar

Breiman, L. (2001). Random forests. Mach. Learn. 45, 5–32. doi:10.1023/A:1010933404324

CrossRef Full Text | Google Scholar

Cade-Menun, B. J., Bell, G., Baker-Ismail, S., Fouli, Y., Hodder, K., Mcmartin, D. W., et al. (2013). Nutrient loss from Saskatchewan cropland and pasture in spring snowmelt runoff. Can. J. Soil Sci. 93, 445–458. doi:10.4141/cjss2012-042

CrossRef Full Text | Google Scholar

Chang, Z., Qi, P., Zhang, G., Sun, Y., Tang, X., Jiang, M., et al. (2022). Latitudinal characteristics of frozen soil degradation and their response to climate change in a high-latitude water tower. Catena 214, 106272. doi:10.1016/j.catena.2022.106272

CrossRef Full Text | Google Scholar

Cohen, J., Lemmetyinen, J., Ruiz, J. J., Rautiainen, K., Ikonen, J., Kontu, A., et al. (2024). Detection of soil and canopy freeze/thaw state in the boreal region with L and C Band Synthetic Aperture Radar. Remote Sens. Environ. 305, 114102. doi:10.1016/j.rse.2024.114102

CrossRef Full Text | Google Scholar

Commission for Environmental Cooperation (Cec). (2021). North American land cover, 2020 (Landsat, 30m). Montreal, Canada: north American Land Change Monitoring system (NALCMS); Canada Centre for mapping and Earth observation (CCMEO), USGS, INEGI, CONABIO, CONAFOR. Available online at: https://www.cec.org/north-american-environmental-atlas/land-cover-30m-2020/.

Google Scholar

Donahue, K., Kimball, J. S., Du, J., Bunt, F., Colliander, A., Moghaddam, M., et al. (2023). Deep learning estimation of northern hemisphere soil freeze-thaw dynamics using satellite multi-frequency microwave brightness temperature observations. Front. Big Data 6, 1243559. doi:10.3389/fdata.2023.1243559

PubMed Abstract | CrossRef Full Text | Google Scholar

Ekici, A., Chadburn, S., Chaudhary, N., Hajdu, L., Marmy, A., Peng, S., et al. (2015). Site-level model intercomparison of high latitude and high altitude soil thermal dynamics in tundra and barren landscapes. TC 9, 1343–1361. doi:10.5194/tc-9-1343-2015

CrossRef Full Text | Google Scholar

Fayad, I., Baghdadi, N., Bazzi, H., and Zribi, M. (2020). Near real-time freeze detection over agricultural plots using Sentinel-1 data. Remote Sens. 12, 1976. doi:10.3390/rs12121976

CrossRef Full Text | Google Scholar

Fox, E. W., Hill, R. A., Leibowitz, S. G., Olsen, A. R., Thornbrugh, D. J., and Weber, M. H. (2017). Assessing the accuracy and stability of variable selection methods for random forest modeling in ecology. Environ. Monit. Assess. 189, 316. doi:10.1007/s10661-017-6025-0

PubMed Abstract | CrossRef Full Text | Google Scholar

Gao, H., Zhang, W., and Chen, H. J. R. S. (2018). An improved algorithm for discriminating soil freezing and thawing using AMSR-E and AMSR2 soil moisture products. Remote Sens. 10, 1697. doi:10.3390/rs10111697

CrossRef Full Text | Google Scholar

Gill, J. P., Yackel, J. J., Geldsetzer, T., and Fuller, M. C. (2015). Sensitivity of C-band synthetic aperture radar polarimetric parameters to snow thickness over landfast smooth first-year sea ice. Remote Sens. Environ. 166, 34–49. doi:10.1016/j.rse.2015.06.005

CrossRef Full Text | Google Scholar

Goldstein, A., Kapelner, A., Bleich, J., and Pitkin, E. (2015). Peeking inside the black box: visualizing statistical learning with plots of individual conditional expectation. J. Comput. Graph. Stat. 24, 44–65. doi:10.1080/10618600.2014.907095

CrossRef Full Text | Google Scholar

Gorelick, N., Hancher, M., Dixon, M., Ilyushchenko, S., Thau, D., and Moore, R. (2017). Google Earth Engine: Planetary-scale geospatial analysis for everyone. Remote Sens. Environ. 202, 18–27. doi:10.1016/j.rse.2017.06.031

CrossRef Full Text | Google Scholar

Gray, D., Landine, P., and Granger, R. (1985). Simulating infiltration into frozen prairie soils in streamflow models. Can. J. Earth Sci. 22, 464–472. doi:10.1139/e85-045

CrossRef Full Text | Google Scholar

Hou, R., Wang, L., O'connor, D., Tsang, D. C., Rinklebe, J., and Hou, D. (2020). Effect of immobilizing reagents on soil Cd and Pb lability under freeze-thaw cycles: implications for sustainable agricultural management in seasonally frozen land. Environ. Int. 144, 106040. doi:10.1016/j.envint.2020.106040

PubMed Abstract | CrossRef Full Text | Google Scholar

Ireson, A., Van Der Kamp, G., Ferguson, G., Nachshon, U., and Wheater, H. (2013). Hydrogeological processes in seasonally frozen northern latitudes: understanding, gaps and challenges. Hydrogeol. J. 21, 53–66. doi:10.1007/s10040-012-0916-5

CrossRef Full Text | Google Scholar

Jagdhuber, T., Stockamp, J., Hajnsek, I., and Ludwig, R. (2014). Identification of soil freezing and thawing states using SAR polarimetry at C-band. Remote Sens. 6, 2008–2023. doi:10.3390/rs6032008

CrossRef Full Text | Google Scholar

Johnston, J. M., Houser, P. R., Maggioni, V., Kim, R. S., and Vuyovich, C. (2021). Informing improvements in freeze/thaw state classification using subpixel temperature. IEEE Trans. Geosci. Remote Sens. 60, 1–19. doi:10.1109/TGRS.2021.3099292

CrossRef Full Text | Google Scholar

Karthikeyan, L., Pan, M., Wanders, N., Kumar, D. N., and Wood, E. F. (2017). Four decades of microwave satellite soil moisture observations: part 1. A review of retrieval algorithms. Adv. Water Resour. 109, 106–120. doi:10.1016/j.advwatres.2017.09.006

CrossRef Full Text | Google Scholar

Kim, Y., Kimball, J. S., Mcdonald, K. C., and Glassy, J. (2010). Developing a global data record of daily landscape freeze/thaw status using satellite passive microwave remote sensing. IEEE Trans. Geosci. Remote Sens. 49, 949–960. doi:10.1109/TGRS.2010.2070515

CrossRef Full Text | Google Scholar

Lund, J., Forster, R. R., Deeb, E. J., Liston, G. E., Skiles, S. M., and Marshall, H.-P. (2022). Interpreting Sentinel-1 SAR backscatter signals of snowpack surface melt/freeze, warming, and ripening, through field measurements and physically-based SnowModel. Remote Sens. 14, 4002. doi:10.3390/rs14164002

CrossRef Full Text | Google Scholar

Manickam, S., and Barros, A. (2020). Parsing synthetic aperture radar measurements of snow in complex terrain: scaling behaviour and sensitivity to snow wetness and landcover. Remote Sens. 12, 483. doi:10.3390/rs12030483

CrossRef Full Text | Google Scholar

Matzler, C., and Wegmuller, U. (1987). Dielectric properties of freshwater ice at microwave frequencies. J. Phys. D. Appl. Phys. 20, 1623–1630. doi:10.1088/0022-3727/20/12/013

CrossRef Full Text | Google Scholar

Mavrovic, A., Pardo Lara, R., Berg, A., Demontoux, F., Royer, A., and Roy, A. (2021). Soil dielectric characterization during freeze–thaw transitions using L-band coaxial and soil moisture probes. Hydrol. Earth Syst. Sci. 25, 1117–1131. doi:10.5194/hess-25-1117-2021

CrossRef Full Text | Google Scholar

Mccoll, K. A., Roy, A., Derksen, C., Konings, A. G., Alemohammed, S. H., and Entekhabi, D. (2016). Triple collocation for binary and categorical variables: application to validating landscape freeze/thaw retrievals. Remote Sens. Environ. 176, 31–42. doi:10.1016/j.rse.2016.01.010

CrossRef Full Text | Google Scholar

Mcnairn, H., and Brisco, B. (2004). The application of C-band polarimetric SAR for agriculture: a review. Can. J. Remote Sens. 30, 525–542. doi:10.5589/m03-069

CrossRef Full Text | Google Scholar

Moradi, M., Cho, E., Jacobs, J. M., and Vuyovich, C. M. (2023). Seasonal soil freeze/thaw variability across North America via ensemble land surface modeling. Cold Reg. Sci. Technol. 209, 103806. doi:10.1016/j.coldregions.2023.103806

CrossRef Full Text | Google Scholar

Moradi, M., Kraatz, S., Johnston, J., and Jacobs, J. M. J. R. S. (2024). Comparing three freeze-thaw schemes using C-Band radar data in Southeastern New Hampshire. USA 16, 2784. doi:10.3390/rs16152784

CrossRef Full Text | Google Scholar

Mugunthan, J. S., Duguay, C. R., and Zakharova, E. (2023). Machine learning based classification of lake ice and open water from Sentinel-3 SAR altimetry waveforms. Remote Sens. Environ. 299, 113891. doi:10.1016/j.rse.2023.113891

CrossRef Full Text | Google Scholar

Muñoz-Sabater, J., Dutra, E., Agustí-Panareda, A., Albergel, C., Arduini, G., Balsamo, G., et al. (2021). ERA5-Land: a state-of-the-art global reanalysis dataset for land applications. Earth Syst. Sci. Data 13, 4349–4383. doi:10.5194/essd-13-4349-2021

CrossRef Full Text | Google Scholar

Niu, G.-Y., and Yang, Z.-L. (2006). Effects of frozen soil on snowmelt runoff and soil water storage at a continental scale. J. Hydrometeorol. 7, 937–952. doi:10.1175/JHM538.1

CrossRef Full Text | Google Scholar

Osei, A. K., Rezanezhad, F., and Oelbermann, M. (2024). Impact of freeze-thaw cycles on greenhouse gas emissions in marginally productive agricultural land under different perennial bioenergy crops. J. Environ. Manage. 357, 120739. doi:10.1016/j.jenvman.2024.120739

PubMed Abstract | CrossRef Full Text | Google Scholar

Outcalt, S. I., Nelson, F. E., and Hinkel, K. M. (1990). The zero-curtain effect: heat and mass transfer across an isothermal region in freezing soil. Water Resour. Res. 26, 1509–1516. doi:10.1029/WR026i007p01509

CrossRef Full Text | Google Scholar

Peng, X., Zhang, T., Cao, B., Wang, Q., Wang, K., Shao, W., et al. (2016). Changes in freezing-thawing index and soil freeze depth over the Heihe River Basin, western China. AAAR 48, 161–176. doi:10.1657/AAAR00C-13-127

CrossRef Full Text | Google Scholar

Pivot, F. C. (2012). C-band SAR imagery for snow-cover monitoring at Treeline, churchill, Manitoba, Canada. Remote Sens. 4, 2133–2155. doi:10.3390/rs4072133

CrossRef Full Text | Google Scholar

Richardson, E., Trevizani, R., Greenbaum, J. A., Carter, H., Nielsen, M., and Peters, B. (2024). The receiver operating characteristic curve accurately assesses imbalanced datasets. Patterns 5, 100994. doi:10.1016/j.patter.2024.100994

PubMed Abstract | CrossRef Full Text | Google Scholar

Saarela, M., and Jauhiainen, S. (2021). Comparison of feature importance measures as explanations for classification models. SN Appl. Sci. 3, 272. doi:10.1007/s42452-021-04148-9

CrossRef Full Text | Google Scholar

Schaufler, S., Bauer-Marschallinger, B., Hochstöger, S., and Wagner, W. (2018). Modelling and correcting azimuthal anisotropy in Sentinel-1 backscatter data. Remote Sens. Lett. 9, 799–808. doi:10.1080/2150704X.2018.1480071

CrossRef Full Text | Google Scholar

Schuur, E. A., Mcguire, A. D., Schädel, C., Grosse, G., Harden, J. W., Hayes, D. J., et al. (2015). Climate change and the permafrost carbon feedback. Nat 520, 171–179. doi:10.1038/nature14338

PubMed Abstract | CrossRef Full Text | Google Scholar

Schuur, E. A., Abbott, B. W., Commane, R., Ernakovich, J., Euskirchen, E., Hugelius, G., et al. (2022). Permafrost and climate change: carbon cycle feedbacks from the warming Arctic. Annu. Rev. Environ. Resour. 47, 343–371. doi:10.1146/annurev-environ-012220-011847

CrossRef Full Text | Google Scholar

Shao, W., and Zhang, T. (2020). Assessment of four near-surface soil Freeze/Thaw detection algorithms based on calibrated passive microwave remote sensing data over China. ESS 7, e2019EA000807. doi:10.1029/2019EA000807

CrossRef Full Text | Google Scholar

Skolunov, A. (1997). Frequency-temperature curve of the complex dielectric constant and refractive index of water. Fibre Chem. 29, 367–373. doi:10.1007/BF02418871

CrossRef Full Text | Google Scholar

Sun, L., Chang, X., Yu, X., Jia, G., Chen, L., Wang, Y., et al. (2021). Effect of freeze-thaw processes on soil water transport of farmland in a semi-arid area. Agric. Water Manag. 252, 106876. doi:10.1016/j.agwat.2021.106876

CrossRef Full Text | Google Scholar

Taghipourjavi, S., Kinnard, C., and Roy, A. (2024a). Data from: in-situ soil temperature data (2 and 10 cm) in agro-forested areas of St-Marthe and St-Maurice for 2020-21 and 2021-22. doi:10.5683/SP3/LGLCKW

CrossRef Full Text | Google Scholar

Taghipourjavi, S., Kinnard, C., and Roy, A. (2024b). Sentinel-1-Based soil freeze–thaw detection in agro-forested areas: a case Study in Southern Québec, Canada. Remote Sens. 16, 1294. doi:10.3390/rs16071294

CrossRef Full Text | Google Scholar

Tetlock, E., Toth, B., Berg, A., Rowlandson, T., and Ambadan, J. T. (2019). An 11-year (2007–2017) soil moisture and precipitation dataset from the Kenaston network in the Brightwater Creek basin, Saskatchewan, Canada. Earth Syst. Sci. Data 11, 787–796. doi:10.5194/essd-11-787-2019

CrossRef Full Text | Google Scholar

Vreugdenhil, M., Wagner, W., Bauer-Marschallinger, B., Pfeil, I., Teubner, I., Rüdiger, C., et al. (2018). Sensitivity of Sentinel-1 backscatter to vegetation dynamics: an Austrian case study. Remote Sens. 10, 1396. doi:10.3390/rs10091396

CrossRef Full Text | Google Scholar

Wackerly, D. D. (2008). Mathematical statistics with applications. 7th Edition ed. USA: Thomson Learning, Inc.

Google Scholar

Walker, V. A., Colliander, A., and Kimball, J. S. (2022). Satellite retrievals of probabilistic freeze-thaw conditions from SMAP and AMSR brightness temperatures. IEEE Trans. Geosci. Remote Sens. 60, 1–11. doi:10.1109/TGRS.2022.3174807

CrossRef Full Text | Google Scholar

Wei, X., Huang, C., Wei, N., Zhao, H., He, Y., and Wu, X. (2019). The impact of freeze–thaw cycles and soil moisture content at freezing on runoff and soil loss. LDD 30, 515–523. doi:10.1002/ldr.3243

CrossRef Full Text | Google Scholar

Wu, X., Dong, Z., Jin, S., He, Y., Song, Y., Ma, W., et al. (2020). First measurement of soil freeze/thaw cycles in the Tibetan Plateau using CYGNSS GNSS-R data. Remote Sens. 12, 2361. doi:10.3390/rs12152361

CrossRef Full Text | Google Scholar

Wu, S., Zhao, T., Pan, J., Xue, H., Zhao, L., and Shi, J. (2022). Improvement in modeling soil dielectric properties during freeze-thaw transitions. IEEE Geosci. Remote Sens. Lett. 19, 1–5. doi:10.1109/LGRS.2022.3154291

CrossRef Full Text | Google Scholar

Yu, Z., Wang, W., Li, C., Liu, W., and Yang, J. (2018). Speckle noise suppression in SAR images using a three-step algorithm. Sens 18, 3643. doi:10.3390/s18113643

PubMed Abstract | CrossRef Full Text | Google Scholar

Zhong, W., Yuan, Q., Liu, T., and Yue, L. (2022). Freeze/thaw onset detection combining SMAP and ASCAT data over Alaska: a machine learning approach. J. Hydrol. 605, 127354. doi:10.1016/j.jhydrol.2021.127354

CrossRef Full Text | Google Scholar

Keywords: Sentinel-1 SAR VH backscatter, Soil freezing binary and probability, Random forest models, Exponential freeze-thaw algorithm (EFTA), Interactive online FT detection tool, Canadian agricultural fields

Citation: Taghipourjavi S, Kinnard C and Roy A (2026) On demand machine learning-driven surface freeze-thaw retrieval across Canadian agricultural regions using Sentinel-1 SAR data. Front. Remote Sens. 6:1728399. doi: 10.3389/frsen.2025.1728399

Received: 19 October 2025; Accepted: 29 December 2025;
Published: 26 January 2026.

Edited by:

Amen Al-Yaari, Université Paris-Sorbonne, France

Reviewed by:

Jiangfeng Lv, Jilin University, China
Mahsa Moradikhaneghahi, University of New Hampshire, United States

Copyright © 2026 Taghipourjavi, Kinnard and Roy. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Shahabeddin Taghipourjavi, c2hhaGFiZWRkaW4udGFnaGlwb3VyamF2aUB1cXRyLmNh

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.