Development of intensity-duration-frequency curves using machine learning and satellite-derived precipitation data

Dargham, Elias; Andraos, Cynthia

doi:10.3389/frwa.2026.1727182

ORIGINAL RESEARCH article

Front. Water, 29 January 2026

Sec. Water and Artificial Intelligence

Volume 8 - 2026 | https://doi.org/10.3389/frwa.2026.1727182

This article is part of the Research Topic: Advancing Machine Learning for Climate and Water Resilience: Techniques for Precipitation Forecasting

Elias Dargham

Cynthia Andraos^*

Regional Center for Water and Environment, Faculty of Engineering, Saint Joseph University of Beirut, Beirut, Lebanon

Intensity-Duration-Frequency (IDF) curves describe the relationship between rainfall intensity, the duration of rainfall events, and the frequency with which these events occur at a specific location. They are conventionally constructed with one of various statistical methods to support flood risk management and infrastructure design. However, these traditional statistical methods struggle to capture the growing variability and uncertainty in extreme precipitation. This study leverages satellite-based precipitation datasets and advanced machine learning techniques as an alternative to these statistical methods to develop more accurate and robust IDF curves, thereby lowering uncertainty and improving the reliability of construction under non-stationary rainfall trends. To this end, daily precipitation data are collected from Global Precipitation Measurement (GPM) satellite observations over Beirut, Lebanon, and then disaggregated into multiple finer temporal resolutions. Maximum rainfall values derived from these data are used to train machine learning and deep learning architectures, including Support Vector Regression (SVR), Artificial Neural Networks (ANN), novel Temporal Convolutional Networks (TCN), and a TCN variant enhanced with sparse self-attention mechanisms (TCAN), which learn the distributions used to generate IDF curves. TCAN was able to explain almost all the variance, followed, in order, by TCN, ANN, SVM and the Gumbel statistical method. The findings highlight how adaptive ML-based models can improve explained variance under variable precipitation patterns, delivering more reliable and robust IDF curves.

1 Introduction

Rainfall intensity-duration-frequency (IDF) curves form a fundamental procedure in hydrological engineering for analyzing precipitation patterns and designing water infrastructure (Collalti et al., 2024; Sherman, 1931). This process establishes critical relationships between the intensity of rainfall events, their duration, and frequency of occurrence, serving as the central component for modern water resource management and infrastructure design (Soto-Escobar et al., 2025). IDF curves have evolved from early empirical formulations in the 1930s to sophisticated statistical tools that are essential for flood forecasting, urban drainage design, and climate resilience planning (Bell, 1969). IDF curves, therefore, play a pivotal role in protecting critical infrastructure against water-related hazards, flood risks and the uncertainties of a changing climate (Martel et al., 2021).

The construction of accurate IDF curves requires high-resolution temporal precipitation data, particularly at sub-hourly intervals. Such data are critical for the design of urban drainage systems and for robust flood risk assessments, as short-duration extreme rainfall events can cause unforeseen natural disasters, potentially resulting in significant loss of life and considerable economic costs (Marra et al., 2024). In many regions, particularly within developing countries, the limited density of rain-gauge networks constrains spatial coverage, which compromises the accuracy of rainfall estimates (Basumatary and Sil, 2016). It also hinders the acquisition of the sub-hourly precipitation measurements required for reliable IDF curve construction, often necessitating the use of suboptimal IDF curves (Gebrechorkos et al., 2024). Continuous advances in satellite precipitation retrieval make it increasingly important to develop methods that incorporate these datasets in constructing IDF curves, especially in regions where in-situ rainfall observations are sparse and have insufficient record length, as satellite retrievals offer a credible and often superior alternative source of sub-hourly precipitation information in data-sparse regions (Nguyen et al., 2018). Satellite-based precipitation products such as the Global Precipitation Measurement (GPM) Integrated Multi-satellitE Retrievals for GPM (IMERG), the Global Satellite Mapping of Precipitation Moving Vector with Kalman Filter (GSMaP-MVK), the Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks (PERSIANN), and others have demonstrated effectiveness in estimating IDF curves, with some showing strong correlation with ground observations. Half-hourly GPM-IMERG data, in particular, have been used to derive regional IDF curves with reasonable accuracy, especially when employing the Gumbel distribution (Lau and Behrangi, 2022).

The statistical distribution functions used to construct IDF curves have been extensively studied and proven effective for infrastructure design under stationary climate assumptions; however, growing variability and uncertainty in precipitation patterns due to climate change raise concerns about the adequacy of this approach in current urban planning and infrastructure, as reliance on stationarity may underestimate extreme precipitation by up to 60% (Cheng and AghaKouchak, 2014). This has led to a growing interest to develop IDF construction methods that no longer rely on the stationarity assumption but instead take into consideration the changing climate and precipitation patterns using methods such as Bayesian inference or other distributions entirely (Gruss et al., 2020; Shehu and Haberlandt, 2023; Ulrich et al., 2021). Beyond the stationarity challenges, several fundamental uncertainties inherent in the statistical modeling approach further complicate IDF curve construction. Parameter estimation uncertainty from limited historical precipitation records leads to large sampling errors in distribution parameters, with confidence intervals often reaching ±20% of expected design rainfall intensities (Wang and McBean, 2014). Likewise, model selection uncertainty occurs because different statistical distributions (Gumbel, GEV, Log-Pearson Type III) produce significantly different quantile estimates for the same data, particularly for longer return periods (Wehner et al., 2024). Furthermore, quantile estimation uncertainty becomes pronounced for extreme return periods, where extrapolation beyond observed data introduces substantial uncertainty, with confidence interval widths increasing dramatically and estimates being highly sensitive to distribution shape parameters (Milojevic et al., 2023).

Machine learning (ML) and deep learning (DL) methods have recently started to appear in the literature as an alternative to the traditional statistical methods used in constructing IDF curves (Acar et al., 2008). This trend has yet to gain traction, and more studies are needed to explore the potential of these methods. DL models have been shown to inherently capture non-linear relationships in data and are capable of approximating any function or distribution given enough data, a fundamental concept in the DL literature known as the Universal Approximation Theorem (UAT) (Goodfellow et al., 2016). However, artificial neural networks on their own still struggle with modeling stochastic processes and the uncertainty inherent in some datasets (Thacker et al., 2020), leading to the development of neural network architectures that are specifically designed to handle stochastic processes and uncertainty, such as Bayesian Neural Networks (BNN) and Temporal Convolutional Attention Networks (TCANs) (Goan and Fookes, 2020; Lin et al., 2021). ML techniques were initially applied to precipitation data preprocessing and temporal disaggregation tasks rather than direct IDF curve construction. Artificial neural networks first emerged in hydrological applications during the early 1990s, primarily focusing on rainfall-runoff modeling, stream flow forecasting, and water quality analysis (Govindaraju, 2000). Early applications of neural networks to precipitation disaggregation began appearing in the literature around 2000, with studies exploring the use of feed-forward neural networks and competitive learning algorithms to disaggregate hourly rainfall data into sub-hourly time increments (Burian et al., 2000). Using machine learning and deep learning methods to construct IDF curves is a relatively new approach to achieving adaptive IDF curves, and only a handful of recent studies have explored this alternative.
Using the daily rainfall data as input and the observed sub-daily resolutions as labels, the ML models learn to estimate the downscaled precipitation data, which can then be used to construct IDF curves for different durations using the known distributions (Hu and Ayyub, 2019). This process, however, still uses machine learning as a preprocessing step in IDF generation rather than as an alternative construction method, and it relies on data from multiple ground stations. More recent work comparing novel machine learning models with traditional statistical methods has shown promising results for using the former as an alternative way to construct IDF curves: deep learning models, especially Long Short-Term Memory (LSTM) and Recurrent Neural Networks (RNN), were shown to be more accurate and reliable for predicting rainfall patterns and IDF curves than conventional methods (Ameen et al., 2025).

Despite these promising developments in applying machine learning to IDF curve construction, several critical limitations remain unaddressed. Most existing ML approaches have primarily focused on temporal disaggregation tasks, and the only efforts to directly use these methods in the construction process have been limited to simple or outdated neural network architectures that cannot fully capture the complex temporal dependencies inherent in precipitation data. Furthermore, the integration of satellite-based precipitation data with machine learning methods—in addition to more advanced models such as temporal convolutional networks (TCNs) and attention mechanisms—to directly construct IDF curves is still largely unexplored. Additionally, uncertainty quantification in ML-based IDF predictions is often overlooked, despite its importance for engineering decisions. Addressing these gaps can enable adaptive, data-driven IDF curves that support resilient infrastructure, even in ungauged regions. Therefore, by leveraging satellite-based precipitation data to train ML and DL models in constructing IDF curves, this approach aims to yield curves that more accurately capture precipitation patterns and better account for uncertainty. This study applies rigorous preprocessing and temporal disaggregation on remotely sensed data obtained from the GPM (Global Precipitation Measurement) mission to derive high-resolution rainfall intensities. Advanced ML and DL models are then trained to estimate and construct IDF relationships—including Support Vector Regression (SVR), Artificial Neural Networks (ANN), TCNs, and the novel Temporal Convolutional Attention Network (TCAN)—after which their outputs are benchmarked against the Gumbel distribution using Nash-Sutcliffe Efficiency (NSE) (Nash and Sutcliffe, 1970), Squared Pearson (R²), Root Mean-Squared Error (RMSE) and Mean-Absolute Error (MAE). 
These findings are then further situated within the broader context of existing literature, highlighting their implications for urban infrastructure planning and demonstrating the potential of advanced deep learning approaches to outperform traditional statistical methods. This integrated framework demonstrates the potential of remote sensing and AI to enhance hydrological modeling in data-scarce, developing environments.

This paper is structured as follows: Section 2 outlines the methodology employed for the acquisition and preprocessing of the satellite-based precipitation data and for the training and testing of the models. Section 3 presents a comprehensive comparative analysis of these models and their results, with a particular focus on their performance in accurately constructing IDF relationships. Section 4 discusses the findings in a broader context. Finally, Section 5 concludes the study, demonstrating ML- and DL-based approaches as potentially useful for IDF curve construction.

2 Materials and methods

2.1 Study area

This study observes precipitation over Beirut, the capital city of Lebanon, and its metropolitan area (Figure 1). The city is centered at approximately 33.89°N latitude and 35.50°E longitude on the eastern coast of the Mediterranean Sea, reaches 95 meters above sea level at its highest point, and covers an area of around 67 km² (Yassin, 2012). Beirut International Airport (33.83°N, 35.50°E) serves as the location of both the gauge station and the target point for satellite precipitation measurement.

Figure 1

Satellite image showing Beirut City and Beirut International Airport. Beirut City is marked with a red pin at longitude 35.4955 and latitude 33.8886. The airport is marked with a green pin at longitude 35.4955 and latitude 33.813. The coast and land features are visible with a scale bar indicating two kilometers.

Figure 1. Landsat 8 image of Beirut, Lebanon and its surrounding area, derived from median pixel values collected between August 1 and August 31, 2017 using the Google Earth Engine platform, highlighting Beirut City (35.4955, 33.8886) and the station location at Beirut International Airport (35.4955, 33.83). Grid lines are displayed at a 0.1-degree spacing (U.S. Geological Survey, 2018; Gorelick et al., 2017).

Beirut's climate is typical of eastern coastal Mediterranean areas, characterized by hot, dry summers and mild, wet winters, with an average annual precipitation of around 850 mm (Andraos and Najem, 2020). The rainfall pattern is highly seasonal: most of the precipitation the city receives falls in December and January, and virtually no precipitation is observed in July and August (Kassem and Gökçekuş, 2020).

2.2 Methodology

This study employs remotely sensed rainfall data to develop IDF curves for return periods of 2, 5, 10, 25, 50, and 100 years. The analysis uses both the Gumbel distribution and advanced ML approaches (SVMs, ANNs, TCNs, and TCANs) for IDF curve construction. To achieve this, high-resolution precipitation data are downloaded from the GPM-IMERG satellite product, which are then processed to extract the extreme rainfall events with the highest intensity for each year in each duration. Once collected, the raw satellite precipitation data undergo rigorous preprocessing and temporal disaggregation. This step ensures temporal consistency and enables the reconstruction of rainfall intensities at various sub-daily durations not natively available in the IMERG product, thus permitting downstream analyses at critical resolutions for IDF construction. Figure 2 goes over all the major steps taken to generate these curves from the collected data and the selected methods.

Figure 2

Flowchart illustrating the process of data collection and analysis for satellite data validation. It starts with collecting station data from Beirut International Airport and satellite data from GPM-IMERG. It proceeds to validate satellite data with station data, followed by preprocessing and disaggregation. Two methods diverge: statistical method involving fitting distribution, location, scale, and constructing IDF curves, and ML/DL method involving hyperparameter optimization, model training, validation, and intensity prediction. Both methods compute validation metrics, followed by comparison, validation, and uncertainty analysis.

Figure 2. Methodology overview.

From this processed satellite-derived dataset, two distinct streams are followed. The first branch applies the Gumbel distribution to the disaggregated precipitation data and generates baseline IDF curves. The derived IDF relationships are benchmarked against Gumbel-based curves constructed using available ground-measured rainfall extremes over a held-out portion of the dataset using NSE, R², RMSE and MAE, thereby quantifying the representativeness and reliability of the satellite-based approach. In the other branch, the derived satellite intensity-duration data are fed into a suite of ML/DL models (SVR, ANN, TCN, and TCAN). Here, the rainfall event duration serves as the input variable and the corresponding maximum satellite-derived intensity as the prediction target, with architectures and hyperparameters optimized for capturing the inherent nonlinearities and temporal dependencies characteristic of hydrological extremes. For each duration, the trained ML/DL models then infer the intensity, from which the return period is subsequently computed using the Weibull plotting position formula. The resulting intensity estimates for various return periods are validated on the same held-out portion of the dataset and with the same metrics as the statistical method to ensure a consistent comparison.

2.3 Data collection

Daily 24-h and sub-daily 30-min precipitation data from January 1998 to February 2025 for Beirut, Lebanon are collected from the GPM-IMERG Version 07 Final Run (It. 07) product using the Google Earth Engine and NASA Giovanni platforms (Huffman et al., 2023a, b). GPM-IMERG It. 07 represents the latest research-quality gridded global multi-satellite precipitation estimates, with comprehensive algorithmic improvements implemented in response to user feedback on Version 06. These include improved intercalibration processes, corrected spatial gridding to fix geolocation offsets, upgraded infrared precipitation retrieval schemes, and the inclusion of passive microwave estimates over frozen surfaces for near-complete global coverage. To validate the reliability of the GPM-IMERG data, precipitation samples are extracted and compared with corresponding measurements from the ground-based station at Beirut International Airport; the station and satellite readings overlap with little error. In addition, measured IDF curves from ground stations in Beirut, Lebanon are extracted from government sources; these are used to validate the statistical and ML/DL methods and serve as a reliable baseline for model performance (Camp Dresser, McKee Inc., Khatib and Alami, 1982).

2.4 Data preprocessing

The 30-min and 24-h precipitation data collected from GPM-IMERG are converted to intensities, from which the intensity values for durations of 5, 10, and 15 min, 1 h, 90 min, and 3, 6, 12, 15, and 18 h are disaggregated using the Indian Meteorological Department Formula (IMF), which is widely used across the hydrology literature in Lebanon (Obeid and Elkholy, 2021). The IMF is defined in Equation 1 (Ramaseshan, 1996):

\begin{array}{l} P_{t} = P_{T} \times {(\frac{t}{T})}^{\frac{1}{3}} & (1) \end{array}

Where:

• $P_{t}$ is the intensity value for the time period $t$ in mm/h.

• $P_{T}$ is the intensity value for the time period $T$ in mm/h.

• $t$ is the time period for which the precipitation value is derived in minutes.

• $T$ is the time period for which the precipitation value is given in minutes.

• The exponent $\frac{1}{3}$ is a parameter that has been tuned to better accommodate for precipitation expectations in Beirut, Lebanon.
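As a minimal sketch of how Equation 1 can be applied, the snippet below scales a single known value to several target durations. The function name `imf_disaggregate`, the 24-h input value, and the duration list are illustrative assumptions and are not taken from the study's data or code; the function simply applies the equation to whatever quantity is passed in.

```python
def imf_disaggregate(p_T: float, t: float, T: float, exponent: float = 1.0 / 3.0) -> float:
    """Apply Equation 1: P_t = P_T * (t / T) ** exponent.

    t and T must be expressed in the same time unit (minutes here).
    """
    if t <= 0 or T <= 0:
        raise ValueError("durations must be positive")
    return p_T * (t / T) ** exponent


# Hypothetical 24-h value (illustrative only, not from the study's dataset).
p_24h = 100.0

# Target durations in minutes: 5, 10, 15 min, 1 h, 90 min, 3, 6, 12, 15, 18 h.
targets_min = [5, 10, 15, 60, 90, 180, 360, 720, 900, 1080]

# Disaggregate the 24-h (1440-min) value to every target duration.
disaggregated = {t: imf_disaggregate(p_24h, t, 1440) for t in targets_min}

for t, value in disaggregated.items():
    print(f"{t:>5d} min -> {value:.2f}")
```

Because the exponent is exposed as a keyword argument, a locally tuned value can be substituted without changing the rest of the sketch.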

These converted and processed intensity time-series are then used to extract the maximum intensities for each duration in the dataset. Subsequently, these maximum intensities are used to construct the IDF curves using the Gumbel distribution and to train the machine learning models. The disaggregated data therefore provide estimated, rather than directly observed, intensities for the durations that are not native to the IMERG product.
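The extraction of yearly maxima for one duration could look like the following sketch. It assumes the series is available as (date, intensity) pairs; the function name `annual_maxima` and the sample records are hypothetical.

```python
from collections import defaultdict
from datetime import date


def annual_maxima(records):
    """Return {year: maximum value} from an iterable of (date, value) pairs."""
    maxima = defaultdict(lambda: float("-inf"))
    for day, value in records:
        maxima[day.year] = max(maxima[day.year], value)
    return dict(sorted(maxima.items()))


# Hypothetical intensity series for a single duration (mm/h).
records = [
    (date(2001, 1, 12), 3.2),
    (date(2001, 12, 3), 7.9),
    (date(2002, 2, 20), 5.4),
    (date(2002, 11, 8), 4.1),
]

series = annual_maxima(records)
print(series)
```

Repeating the call once per duration yields one annual-maximum series per duration, which is the input both the Gumbel fit and the ML models need.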

2.5 Construction and training

2.5.1 Statistical method

The IDF curves are first constructed using the Gumbel Distribution fitted to the extreme intensities using the maximum likelihood estimation (MLE) method for each duration to determine the location ( $α$ ) and scale ( $β$ ) parameters, whose derived values are shown in Table 1. The Gumbel distribution (Gumbel, 1935) and its inverse cumulative function are defined in Equations 2, 3 respectively:

\begin{array}{l} F (x; α, β) = e^{- e^{- \frac{x - α}{β}}} & (2) \end{array}

\begin{array}{l} Q (p) = α - β \ln (- \ln (p)), \quad p \in (0, 1) & (3) \end{array}

Table 1

Table 1. Gumbel’s location and scale parameters.

Where:

• $x$ is the input intensity value in mm/h.

• $p$ is the probability value in the interval (0, 1).

Using these parameters, the return period intensities are calculated through the Gumbel inverse cumulative distribution function, resulting in IDF curves that are rigorously constructed from satellite-based data and serve both as an independent predictive model and as the statistical benchmark for comparison with the ML models.
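The fitting and inversion steps can be sketched in a few lines. This is an illustrative implementation only, not the study's code: it uses the standard Gumbel (maximum) form F(x) = exp(-exp(-(x - α)/β)), which is the form consistent with the inverse in Equation 3, solves the maximum likelihood equations by bisection, and assumes an exceedance-based return period, p = 1 - 1/T. All function names and numeric values are hypothetical.

```python
import math


def gumbel_cdf(x, alpha, beta):
    """Gumbel (maximum) CDF: F(x) = exp(-exp(-(x - alpha) / beta))."""
    return math.exp(-math.exp(-(x - alpha) / beta))


def gumbel_quantile(p, alpha, beta):
    """Inverse CDF (Equation 3): Q(p) = alpha - beta * ln(-ln(p)), 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie in (0, 1)")
    return alpha - beta * math.log(-math.log(p))


def fit_gumbel_mle(sample, iterations=200):
    """Maximum likelihood estimates (alpha, beta), found by bisection on beta.

    The scale solves beta = mean(x) - sum(x * w) / sum(w) with w = exp(-x / beta);
    the weights are shifted by min(x) for numerical stability.
    """
    n = len(sample)
    mean = sum(sample) / n
    x_min, x_max = min(sample), max(sample)
    if x_max == x_min:
        raise ValueError("sample must contain at least two distinct values")

    def weights(beta):
        return [math.exp(-(x - x_min) / beta) for x in sample]

    def score(beta):
        w = weights(beta)
        weighted_mean = sum(x * wi for x, wi in zip(sample, w)) / sum(w)
        return beta - mean + weighted_mean

    # score(lo) < 0 < score(hi) on this bracket.
    lo, hi = 1e-9 * (x_max - x_min), x_max - x_min
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    beta = 0.5 * (lo + hi)
    alpha = x_min - beta * math.log(sum(weights(beta)) / n)
    return alpha, beta


def return_level(T_years, alpha, beta):
    """Intensity with return period T, assuming p = 1 - 1/T (non-exceedance)."""
    return gumbel_quantile(1.0 - 1.0 / T_years, alpha, beta)


# Hypothetical parameters for one duration (mm/h), not values from Table 1.
alpha, beta = 20.0, 5.0
for T in (2, 5, 10, 25, 50, 100):
    print(T, round(return_level(T, alpha, beta), 2))
```

Running the fit once per duration and evaluating `return_level` for each return period produces one point per (duration, return period) pair, which together trace the IDF curves.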

2.5.2 Machine learning and deep learning models

To train the ML and DL models, the duration of the event is taken as input, and the extreme maximum intensity of the duration in the dataset as output. Additionally, 50% of the data is allocated for fitting the models to the data and 50% for validating the models to ensure the models properly generalize using RMSE, MAE, R² and NSE. After the training process is completed, the model is used to predict intensities for each duration. These intensities are then used to compute the return period intensities for the IDF curves using the Weibull plotting position formula (Weibull, 1951), defined in Equation 4:

\begin{array}{l} p = \frac{i}{n + 1} & (4) \end{array}

Where:

• $p$ is the derived plotting position.

• $i$ is the rank of the data point in ascending order.

• $n$ is the total number of data points.

This nonparametric approach enables robust estimation of rainfall extremes even in the presence of uncertainty, non-stationarity or distributional ambiguities. More details on the training algorithms and optimization strategies are included for each model in the following sections.
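A short sketch of Equation 4 follows. It assumes that, with ranks taken in ascending order, p is read as a non-exceedance probability, so that the empirical return period is 1/(1 - p); this reading, the function name, and the sample values are assumptions made for illustration.

```python
def weibull_plotting_positions(values):
    """Return (value, p, return_period) tuples sorted in ascending order.

    p = i / (n + 1), with i the ascending rank (Equation 4). Here p is
    interpreted as a non-exceedance probability, so T = 1 / (1 - p).
    """
    ordered = sorted(values)
    n = len(ordered)
    positions = []
    for i, value in enumerate(ordered, start=1):
        p = i / (n + 1)
        positions.append((value, p, 1.0 / (1.0 - p)))
    return positions


# Hypothetical annual maximum intensities for one duration (mm/h).
sample = [12.1, 30.4, 18.7, 22.3, 15.0, 27.8, 19.9, 24.6, 16.2]

for value, p, T in weibull_plotting_positions(sample):
    print(f"{value:6.1f} mm/h  p={p:.2f}  T={T:.2f} yr")
```

With nine values the largest observation receives p = 0.9 and a return period of about 10 years, which shows why empirical return periods cannot exceed roughly the record length plus one.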

2.5.2.1 Support vector machines

The SVM model is trained to predict the log-transformed maximum rainfall intensity from the log-transformed event duration, using the SVR approach, i.e., an SVM model for Regression tasks. The SVR model finds an optimal hyperplane by solving the following optimization problem defined in Equation 5 (Drucker et al., 1996):

\begin{array}{l} \min_{w, b} \left( \frac{1}{2} {‖ w ‖}^{2} + C \sum_{i = 1}^{n} (ξ_{i} + ξ_{i}^{*}) \right) & (5) \end{array}

Subject to:

\begin{array}{l} y_{i} - (w^{T} ϕ (x_{i}) + b) \leq ϵ + ξ_{i} \\ (w^{T} ϕ (x_{i}) + b) - y_{i} \leq ϵ + ξ_{i}^{*} \\ ξ_{i}, ξ_{i}^{*} \geq 0 \end{array}

Each prediction is estimated by the following function defined in Equation 6:

\begin{array}{l} f (x) = \sum_{i = 1}^{n} (α_{i} - α_{i}^{*}) K (x_{i}, x) + b & (6) \end{array}

Where:

• $w$ is the weight vector.

• $b$ is the bias term.

• $ϵ$ is the tolerance parameter.

• $C$ is the regularization parameter controlling the trade-off between fitting error and model complexity.

• $ξ_{i}, ξ_{i}^{*}$ are slack variables allowing violations of the margin.

• $K (x_{i}, x)$ is the kernel function (Gaussian, Polynomial…).

• $α_{i}, α_{i}^{*}$ are Lagrange multipliers (support vector coefficients).
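To make Equation 6 concrete, the sketch below evaluates the SVR decision function for a scalar input with a Gaussian (RBF) kernel, together with the ε-insensitive loss that the slack variables in Equation 5 measure. It does not solve the optimization problem itself; the support points, dual coefficients, and `gamma` value are hypothetical placeholders for what a trained model would supply.

```python
import math


def rbf_kernel(x_i, x, gamma=1.0):
    """Gaussian kernel K(x_i, x) = exp(-gamma * (x_i - x)^2) for scalar inputs."""
    return math.exp(-gamma * (x_i - x) ** 2)


def svr_predict(x, support_x, dual_coefs, b, gamma=1.0):
    """Equation 6: f(x) = sum_i (alpha_i - alpha_i*) K(x_i, x) + b.

    dual_coefs[i] holds the difference alpha_i - alpha_i* for support point i.
    """
    return sum(
        c * rbf_kernel(x_i, x, gamma) for x_i, c in zip(support_x, dual_coefs)
    ) + b


def epsilon_insensitive_loss(y, y_hat, epsilon):
    """Zero inside the epsilon tube, linear outside (equals xi_i + xi_i*)."""
    return max(0.0, abs(y - y_hat) - epsilon)


# Hypothetical trained quantities on scaled log-duration inputs.
support_x = [0.0, 0.5, 1.0]
dual_coefs = [1.2, -0.3, -0.9]
bias = 2.0

prediction = svr_predict(0.25, support_x, dual_coefs, bias, gamma=2.0)
print(prediction, epsilon_insensitive_loss(2.5, prediction, epsilon=0.1))
```

Only points with a non-zero coefficient contribute to the sum, which is why the fitted model can be stored as a small set of support vectors.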

The SVM model is selected for its ability to capture potential non-linear relationships between rainfall intensity and event duration, making it a candidate for representing complex hydrological processes. For training, the input data are scaled using min-max scaling, and a grid search with 5-fold cross-validation is employed to optimize the SVR hyperparameters. The best-performing configuration is then refitted to the training set, and its predictions are inverse-transformed and exponentiated to recover the original intensity scale. Finally, after validation and testing, IDF curves are generated for various return periods by scaling the base SVM predictions with frequency factors derived from the Weibull plotting position formula.

2.5.2.2 Artificial neural network

The ANN model is trained to predict the log-transformed extreme rainfall intensity using the log-transformed event duration as input. ANNs are a strong option for this task because of the UAT, which shows that a feedforward neural network with at least one hidden layer and enough neurons can approximate any continuous function on a compact domain (Hornik et al., 1989). ANN models learn by making an initial prediction, called the forward pass, computing a differentiable loss function by comparing the prediction with the actual targets, and then updating the weights of the model through gradient descent. The general algorithm is given in Equations 7–10 (Rumelhart et al., 1986; Goodfellow et al., 2016):

For each layer $l$ in a network with $L$ layers, the forward pass computes:

\begin{array}{l} z^{[l]} = W^{[l]} a^{[l - 1]} + b^{[l]} & (7) \end{array}

\begin{array}{l} a^{[l]} = σ (z^{[l]}) & (8) \end{array}

\begin{array}{l} \hat{y} = a^{[L]} & (9) \end{array}

Where:

• $z^{[l]}$ is the pre-activation linear combination of layer $l$ .

• $W^{[l]}$ is the weight matrix of layer $l$ .

• $a^{[l]}$ is the post-activation output of layer $l$ .

• $b^{[l]}$ is the bias vector of layer $l$ .

• $σ (\cdot)$ is the activation function (Rectified Linear Unit (ReLU), sigmoid, tanh…).

• $a^{[0]} = X$ is the input to the first layer.

• $\hat{y}$ is the final predicted output.

Let the loss $ℒ$ be any differentiable function that measures the difference between $y$ and $\hat{y}$ . The backpropagation algorithm computes for each layer $l$ in the network with $L$ layers:

\begin{array}{l} W^{[l]} = W^{[l]} - η \frac{\partial ℒ}{\partial W^{[l]}} & (10) \end{array}

Where $η$ is the learning rate.
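The forward pass and the update rule in Equations 7 to 10 can be illustrated with a very small network written in plain Python. This is a sketch under stated assumptions, not the architecture used in the study: one tanh hidden layer, a linear output, full-batch gradient descent on mean squared error, and synthetic data generated from a hypothetical power law. Biases are updated with the same rule that Equation 10 gives for the weights.

```python
import math
import random


def forward(x, W1, b1, W2, b2):
    """Forward pass for a 1-input, 1-output network with one hidden layer."""
    z1 = [w * x + b for w, b in zip(W1, b1)]          # Eq. 7, hidden layer
    a1 = [math.tanh(z) for z in z1]                   # Eq. 8
    y_hat = sum(w * a for w, a in zip(W2, a1)) + b2   # Eqs. 7 and 9, linear output
    return a1, y_hat


def train(xs, ys, hidden=8, lr=0.01, epochs=2000, seed=0):
    """Full-batch gradient descent on mean squared error; returns params and loss history."""
    rng = random.Random(seed)
    W1 = [rng.uniform(-1.0, 1.0) for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [rng.uniform(-1.0, 1.0) for _ in range(hidden)]
    b2 = 0.0
    n = len(xs)
    history = []
    for _ in range(epochs):
        gW1, gb1, gW2, gb2 = [0.0] * hidden, [0.0] * hidden, [0.0] * hidden, 0.0
        loss = 0.0
        for x, y in zip(xs, ys):
            a1, y_hat = forward(x, W1, b1, W2, b2)
            err = y_hat - y
            loss += err * err / n
            d_out = 2.0 * err / n                     # dL/dy_hat
            gb2 += d_out
            for j in range(hidden):
                gW2[j] += d_out * a1[j]
                d_z = d_out * W2[j] * (1.0 - a1[j] ** 2)  # chain rule through tanh
                gW1[j] += d_z * x
                gb1[j] += d_z
        history.append(loss)
        # Eq. 10: parameter <- parameter - learning rate * gradient.
        W1 = [w - lr * g for w, g in zip(W1, gW1)]
        b1 = [b - lr * g for b, g in zip(b1, gb1)]
        W2 = [w - lr * g for w, g in zip(W2, gW2)]
        b2 -= lr * gb2
    return (W1, b1, W2, b2), history


# Synthetic example: log-duration in, log-intensity out (hypothetical power law).
durations = [5, 10, 15, 60, 90, 180, 360, 720, 900, 1080, 1440]
log_d = [math.log(d) for d in durations]
lo, hi = min(log_d), max(log_d)
xs = [(v - lo) / (hi - lo) for v in log_d]             # min-max scaling
ys = [math.log(120.0 * d ** -0.6) for d in durations]

params, history = train(xs, ys)
print(f"initial loss {history[0]:.4f}, final loss {history[-1]:.4f}")
```

Exponentiating the network output recovers intensities on the original scale, mirroring the inverse transformation described for the trained models.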

Model training is performed using specialized optimizers, minimizing mean squared error loss. After training, predictions are inverse-transformed and exponentiated to recover the original intensity scale. Model performance is evaluated across all return periods.

2.5.2.3 Temporal convolutional network

The TCN model is implemented to predict the log-transformed extreme rainfall intensity using sequences of duration as input features. TCNs leverage dilated convolutions and residual connections to efficiently capture both local and long-range dependencies in the duration-intensity relationship, making them well suited to modeling the complex temporal structure of rainfall events. Their lightweight architecture enables high accuracy with few parameters, making them a natural choice for IDF curve construction. The fundamental operation performed by the TCN is the dilated convolution, defined in Equation 11 (Bai et al., 2018):

\begin{array}{l} F_{TCN} (x_{t}) = \sum_{i = 0}^{k - 1} f (i) \cdot x_{t - d \cdot i} & (11) \end{array}

And the residual block in Equation 12:

\begin{array}{l} F_{block} (x) = ReLU (Conv (x, d, k) + Dropout) + x & (12) \end{array}

Where:

• $x_{t}$ is the input at timestep $t$ .

• $f (i)$ is the convolution filter of size $k$ .

• $d$ is the dilation factor which increases exponentially with layer depth $d_{l} = 2^{l - 1}$ .

• The causality ensures predictions at time $t$ depend only on $t$ and earlier timesteps.

• The dropout is a regularization factor, the rate at which units in the layer are dropped.
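The dilated causal convolution in Equation 11 can be written out directly. The sketch below assumes zero padding for timesteps before the start of the sequence and, for the receptive-field helper, one convolution per layer with dilation 2^(l-1); the function names and inputs are illustrative.

```python
def dilated_causal_conv(x, f, d):
    """Equation 11: y[t] = sum_{i=0}^{k-1} f[i] * x[t - d*i], with x[s] = 0 for s < 0."""
    k = len(f)
    y = []
    for t in range(len(x)):
        total = 0.0
        for i in range(k):
            s = t - d * i
            if s >= 0:  # causal: only the current and earlier timesteps contribute
                total += f[i] * x[s]
        y.append(total)
    return y


def receptive_field(k, num_layers):
    """Timesteps seen by a stack with one convolution per layer and dilation 2**(l-1)."""
    return 1 + (k - 1) * sum(2 ** (l - 1) for l in range(1, num_layers + 1))


x = [1, 2, 3, 4, 5]
print(dilated_causal_conv(x, f=[1, 1], d=1))
print(dilated_causal_conv(x, f=[1, 1], d=2))
print(receptive_field(k=3, num_layers=2))
```

Doubling the dilation at each layer lets the receptive field grow exponentially with depth while the number of filter weights grows only linearly.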

For training, each input sample is formatted as a sequence of length 5 so that the data are properly passed to the TCN, with features scaled using min-max scaling and gradient clipping applied during optimization to improve stability. The model architecture consists of two dilated 1-Dimensional (1D) convolutional layers with residual connections, followed by global average pooling and a linear output layer, resulting in a highly parameter-efficient design.

2.5.2.4 Temporal convolutional attention network

The TCAN model extends the TCN by incorporating a lightweight multi-head self-attention mechanism after the dilated convolutional layers. After the temporal features $H$ are extracted via TCN, the model then computes the self-attention as defined in Equations 13, 14 (Vaswani et al., 2017; Lin et al., 2021):

\begin{array}{l} {head}_{i} = Attention (Q_{i}, K_{i}, V_{i}) = softmax (\frac{Q_{i} K_{i}^{T}}{\sqrt{d_{k}}}) V_{i} & (13) \end{array}

For $i = 1, 2, 3, \dots, h$ heads:

\begin{array}{l} MultiHead (H) = Concat ({head}_{1}, {head}_{2}, {head}_{3}, \dots, {head}_{h}) W_{O} & (14) \end{array}

Where:

• $H \in ℝ^{T \times d}$ , i.e., $H$ is the sequence of hidden representations, with $T$ timesteps and $d$ hidden features.

• $Q = H W_{Q}$ , $K = H W_{K}$ , $V = H W_{V}$ , where $W_{Q}, W_{K}, W_{V} \in ℝ^{d \times d_{k}}$ are learnable weight matrices for each head.

• $Q, K, V \in ℝ^{T \times d_{k}}$ , such that $Q$ is the Query, $K$ is the Key and $V$ is the Value matrix.

• $Concat (\cdot) \in ℝ^{T \times (h \cdot d_{k})}$ .

• $W_{O} \in ℝ^{h \cdot d_{k} \times d}$ is another learnable matrix that projects back to the original dimension.
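A plain-Python sketch of scaled dot-product attention and the multi-head combination may help in reading Equations 13 and 14. It implements dense attention over small nested lists, so it does not reproduce the sparse variant or any implementation detail of the TCAN used in the study; all names, shapes, and weight values are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(A, B):
    """Matrix product of two nested-list matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def attention(Q, K, V):
    """Equation 13: softmax(Q K^T / sqrt(d_k)) V. Returns (output, weights)."""
    d_k = len(K[0])
    scores = matmul(Q, transpose(K))
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, V), weights


def multi_head(H, heads, W_O):
    """Equation 14: concatenate per-head outputs and project with W_O.

    heads is a list of (W_Q, W_K, W_V) tuples, one per head.
    """
    outputs = []
    for W_Q, W_K, W_V in heads:
        out, _ = attention(matmul(H, W_Q), matmul(H, W_K), matmul(H, W_V))
        outputs.append(out)
    concat = [sum((o[t] for o in outputs), []) for t in range(len(H))]
    return matmul(concat, W_O)


# With identical keys every position receives equal weight,
# so each output row is the average of the value rows.
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[0.0, 0.0], [0.0, 0.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
out, w = attention(Q, K, V)
print(out, w)
```

Each row of the weight matrix sums to one, so every output timestep is a convex combination of the value rows, which is how a position can draw on information from any other position in the sequence.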

This addition allows the network to better capture long-range dependencies and interactions between different temporal positions in the input sequence, which makes TCANs particularly effective for generating accurate IDF curves, especially in scenarios where capturing subtle temporal dependencies is critical. The input preparation, scaling procedures and training procedures mirror those used for the TCN with slight adjustments to the hyperparameters.

3 Results

The satellite-based IDF curves generated using the statistical method and the ML models are validated against the held-out 50% split of the data using RMSE, MAE, R², and NSE to measure their raw performance and improvement. The scores achieved by each method on these metrics serve as the basis for the comparison. Table 2 shows the metric scores for all methods, which are discussed in detail in the following subsections, while Figure 3 shows the IDF curves constructed by the statistical method and the SVM model, and Figure 4 shows the curves constructed by the DL models.

Table 2

Table 2. Performance metrics of statistical, machine learning and deep learning models.

Figure 3

Two graphs display Intensity-Duration-Frequency (IDF) curves for different return periods ranging from 2 to 100 years. Graph (A) uses the Statistical Method, showing rainfall intensity decreasing with longer durations; key metrics include RMSE 4.7947 and NSE 0.5960. Graph (B) employs SVM, also illustrating decreasing intensity; metrics are RMSE 10.3266 and NSE 0.8995. Both graphs have similarly colored curves representing return periods and labeled axes for duration in minutes and intensity in mm/h.

Figure 3. IDF curves generated using the statistical method (A) and the SVM machine learning model (B) over the 2, 5, 10, 25, 50, and 100-year return periods.

Figure 4

Three graphs display Intensity-Duration-Frequency (IDF) curves for different return periods. Graph A shows results using ANN, with metrics: MAE 4.4415, RMSE 8.7456, R-squared 0.9391, NSE 0.9279. Graphs B and C use TCN, with slightly varying metrics, demonstrating how intensity decreases over time for various return periods (2, 5, 10, 25, 50, 100 years). The graphs compare model performance for rainfall prediction.

Figure 4. IDF curves generated by the deep learning approaches: ANN (A), TCN (B), and TCAN (C) over the 2, 5, 10, 25, 50, and 100-year return periods. The DL models show superior performance compared to the other methods.

To aid interpretation of the metrics, RMSE is sensitive to large deviations because of its squaring operation, which places greater weight on large prediction errors and outliers. This property is particularly important in IDF curve construction, where capturing extreme rainfall events is crucial. MAE, on the other hand, gives the average magnitude of error and weights all deviations equally, making it less sensitive to outliers. It is well suited to evaluating how models perform on typical rainfall values, ensuring the analysis is not dominated by extremes or irregular points. Both R² and NSE quantify how well model predictions reproduce observed data. R² is widely used in machine learning for regression tasks, while NSE is the standard in hydrology. When R² is computed as the coefficient of determination, the two formulations are mathematically identical despite their different disciplinary origins. To avoid reporting the same quantity twice, this paper therefore uses R² to refer to the squared Pearson correlation coefficient, which represents the proportion of variance in the observed values explained by a linear relationship with the predictions.
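
The four metrics described above can be sketched in a few lines of Python. This is a minimal illustration rather than the evaluation code used in the study: the function names and the sample intensities are hypothetical, and R² is computed as the squared Pearson correlation, following the convention stated in the text.

```python
import math

def rmse(obs, pred):
    """Root mean square error: squaring emphasises large deviations."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mae(obs, pred):
    """Mean absolute error: every deviation is weighted equally."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)

def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot

def pearson_r2(obs, pred):
    """Squared Pearson correlation between observations and predictions."""
    mean_obs = sum(obs) / len(obs)
    mean_pred = sum(pred) / len(pred)
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(obs, pred))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_pred = sum((p - mean_pred) ** 2 for p in pred)
    return cov ** 2 / (var_obs * var_pred)

# Hypothetical rainfall intensities in mm/h (not data from the study).
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 41.0]
biased = [o + 5.0 for o in observed]  # perfectly correlated but offset

print("RMSE:", rmse(observed, predicted))
print("MAE :", mae(observed, predicted))
print("NSE :", nse(observed, predicted))
print("R2  :", pearson_r2(observed, predicted))
print("Biased case -> R2:", pearson_r2(observed, biased), "NSE:", nse(observed, biased))
```

The biased series shows why the two scores are reported side by side: a constant offset leaves the squared Pearson R² at 1 while NSE drops, so R² is never smaller than NSE and the gap between them points to systematic bias or scale error.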

3.1 Statistical method results

The IDF curves generated from satellite data with the Gumbel distribution fit the validation set reasonably well, with moderate NSE and R² alongside low RMSE and MAE values, indicating that the satellite-based curves can represent rainfall in the region. Quantitatively, these curves achieve R² = 0.623, NSE = 0.596, RMSE = 4.795 mm/h and MAE = 3.881 mm/h on the validation set. Constructing the curves from satellite-derived data thus captures extreme hydrologic patterns across multiple durations well enough to support flood risk assessment and infrastructure planning. This underscores the value of high-resolution satellite products for generating reference IDF relationships, validating ground observations, and expanding the spatial coverage of essential design data in water-scarce regions.
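
For readers unfamiliar with how a Gumbel distribution yields intensities for each return period, the following sketch fits the distribution by the method of moments and evaluates the return level for the six return periods used in the figures. The fitting procedure is an assumption made for illustration (this section does not state how the parameters were estimated), and the annual maxima are hypothetical values, not data from the study.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def gumbel_fit_moments(annual_maxima):
    """Estimate Gumbel location and scale from the sample mean and variance."""
    n = len(annual_maxima)
    mean = sum(annual_maxima) / n
    var = sum((x - mean) ** 2 for x in annual_maxima) / (n - 1)
    scale = math.sqrt(6.0 * var) / math.pi
    loc = mean - EULER_GAMMA * scale
    return loc, scale

def gumbel_return_level(loc, scale, return_period):
    """Intensity exceeded on average once every `return_period` years."""
    non_exceedance = 1.0 - 1.0 / return_period
    return loc - scale * math.log(-math.log(non_exceedance))

# Hypothetical annual maximum intensities (mm/h) for a single duration.
maxima = [22.0, 35.0, 28.0, 41.0, 30.0, 26.0, 38.0, 33.0]
loc, scale = gumbel_fit_moments(maxima)

return_periods = (2, 5, 10, 25, 50, 100)
levels = [gumbel_return_level(loc, scale, t) for t in return_periods]
for t, level in zip(return_periods, levels):
    print(f"T = {t:>3} yr: {level:.2f} mm/h")
```

Repeating the fit for each duration and joining the return levels across durations gives one curve per return period, which is the general shape of the statistical IDF set shown in Figure 3A.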

3.2 Machine learning and deep learning models results

3.2.1 SVM results

The SVM model captures the overall shape of the IDF curves well and models the variance and general IDF patterns with considerable skill, as shown by its high R² and NSE, yet it performs worse on pointwise errors. The evaluation metrics reflect this trend, with R² = 0.900, NSE = 0.899, RMSE = 10.326 mm/h, and MAE = 4.14 mm/h. The large gap between RMSE and MAE indicates that a small number of large errors dominate, which shows that the SVM struggles to fit extreme values despite performing well in regular and moderate scenarios. In those scenarios it benefits from the kernel trick, which implicitly maps the data into a higher-dimensional feature space where nonlinear patterns are easier to fit. The performance of the SVM curves is visualized in Figure 5A.
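
The kernel trick mentioned above can be made concrete with a small numerical check. The sketch below uses a degree-2 polynomial kernel because its explicit feature map is short enough to write out; the kernel actually used for the SVM in this study is not stated in this section, and the RBF kernel shown alongside is included only as a common choice. All names and inputs are hypothetical.

```python
import math

def dot(a, b):
    """Plain inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

def poly2_kernel(x, y):
    """Degree-2 polynomial kernel evaluated in the original 2-D space."""
    return dot(x, y) ** 2

def poly2_features(x):
    """Explicit 3-D feature map whose inner product equals poly2_kernel."""
    return (x[0] ** 2, math.sqrt(2.0) * x[0] * x[1], x[1] ** 2)

def rbf_kernel(x, y, gamma=0.5):
    """Radial basis function kernel; its feature space is infinite-dimensional."""
    squared_distance = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-gamma * squared_distance)

# Two hypothetical input points, e.g. (scaled duration, scaled return period).
x = (1.0, 2.0)
y = (3.0, 4.0)

implicit = poly2_kernel(x, y)                          # no feature map needed
explicit = dot(poly2_features(x), poly2_features(y))   # same value, more work
print(implicit, explicit, rbf_kernel(x, y))
```

Both routes give the same number, but the kernel never builds the higher-dimensional vectors, which is what lets a support vector model fit nonlinear intensity-duration relationships at modest cost.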

Four scatter plots labeled A, B, C, and D show predicted versus observed intensity (mm/hr) with a 1:1 reference line. Plot A (SVM) has RMSE 10.3266, MAE 4.1401, R² 0.9001, NSE 0.8995. Plot B (ANN) shows RMSE 8.7450, MAE 4.4415, R² 0.9191, NSE 0.9279. Plot C (TCN) indicates RMSE 6.8703, MAE 4.0638, R² 0.9577, NSE 0.9555. Plot D (TCAN) reveals RMSE 6.6094, MAE 3.6144, R² 0.9612, NSE 0.9588. All graphs illustrate close alignment with the reference line.

Figure 5. The ML and DL models’ predictions compared to the rainfall observations in the hold-out validation set sorted by ascending order of performance: SVM (A), ANN (B), TCN (C), and TCAN (D).

3.2.2 ANN results

The ANN model captures the overall shape of the IDF curves well and estimates extreme rainfall intensities across the various durations more reliably than the SVM. Quantitatively, the ANN curves yield R² = 0.929, NSE = 0.927, RMSE = 8.745 mm/h and MAE = 4.442 mm/h. With both R² and NSE above 0.90, the ANN represents the overall trend better than the SVM. These results show that ANNs are reasonable predictors for both regular and extreme values, owing to their multi-layered architecture, which allows flexible mapping of nonlinear relationships and a robust representation of complex rainfall patterns. The overall performance of the ANN-constructed curves is shown in Figure 5B.

3.2.3 TCN results

The TCN model fits extreme rainfall intensities accurately across all durations. Its IDF curves perform consistently over the validation range and capture the trend and variance of the dataset. According to the evaluation criteria, the TCN achieves R² = 0.958, NSE = 0.956, RMSE = 6.870 mm/h and MAE = 4.069 mm/h (Figure 5C). These findings demonstrate the advantages of temporal convolutional architectures in capturing sequential rainfall dependencies and producing robust, high-resolution IDF predictions while preserving computational efficiency.
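
The building block behind a temporal convolutional network is the dilated causal convolution, sketched below in plain Python. This is an illustration of the general technique only; the layer sizes, dilations and weights are hypothetical and are not the configuration used in this study. The receptive-field formula assumes one convolution per level.

```python
def causal_dilated_conv(series, weights, dilation):
    """1-D convolution where output[t] depends only on series[t], series[t - d], ...

    Positions before the start of the series are treated as zeros
    (implicit left padding), so no future value ever leaks into the output.
    """
    output = []
    for t in range(len(series)):
        acc = 0.0
        for j, w in enumerate(weights):
            idx = t - j * dilation
            if idx >= 0:
                acc += w * series[idx]
        output.append(acc)
    return output

def receptive_field(kernel_size, dilations):
    """Number of past steps visible at the top of a stack of dilated layers."""
    return 1 + (kernel_size - 1) * sum(dilations)

# Hypothetical rainfall series and filter weights.
rain = [1.0, 2.0, 3.0, 4.0]
print(causal_dilated_conv(rain, [1.0, 1.0], dilation=1))
print(causal_dilated_conv(rain, [1.0, 1.0], dilation=2))
print(receptive_field(kernel_size=3, dilations=[1, 2, 4, 8]))
```

Doubling the dilation at each level makes the receptive field grow exponentially with depth, which is how a shallow stack can still cover long rainfall sequences.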

3.2.4 TCAN results

The TCAN model produces well-constructed IDF curves that accurately reflect the expected trends across rainfall durations, with very consistent performance that closely captures the trend and variance expected of IDF curves throughout the validation range. Its performance metrics are R² = 0.961, NSE = 0.959, RMSE = 6.609 mm/h, and MAE = 3.841 mm/h, ranking the TCAN just above the TCN in overall accuracy (Figure 5D). The integration of attention mechanisms enhances the TCAN’s ability to learn complex temporal relationships and provides improved interpretability, allowing it to better describe the data it was trained on. This positions the TCAN as a particularly robust and versatile approach for future IDF curve construction tasks.
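
To give a sense of how an attention step can be both selective and interpretable, the sketch below compares softmax with sparsemax as the normalizer of scaled dot-product attention scores. Sparsemax is one common way to obtain sparse attention weights; whether the TCAN in this study uses this exact normalizer is an assumption, and the query, keys and values are hypothetical.

```python
import math

def softmax(scores):
    """Dense normalizer: every time step receives a non-zero weight."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sparsemax(scores):
    """Sparse normalizer: projects scores onto the probability simplex,
    setting the weights of low-scoring time steps to exactly zero."""
    ordered = sorted(scores, reverse=True)
    running, support, support_sum = 0.0, 0, 0.0
    for j, z in enumerate(ordered, start=1):
        running += z
        if 1.0 + j * z > running:
            support, support_sum = j, running
    threshold = (support_sum - 1.0) / support
    return [max(s - threshold, 0.0) for s in scores]

def attend(query, keys, values, normalizer):
    """Scaled dot-product attention for one query over scalar values."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = normalizer(scores)
    return weights, sum(w * v for w, v in zip(weights, values))

# Hypothetical query, per-time-step keys, and intensities (mm/h) as values.
query = [1.0, 0.0]
keys = [[2.0, 0.0], [0.5, 0.0], [-1.0, 0.0]]
values = [30.0, 20.0, 10.0]

dense_weights, dense_out = attend(query, keys, values, softmax)
sparse_weights, sparse_out = attend(query, keys, values, sparsemax)
print(dense_weights, dense_out)
print(sparse_weights, sparse_out)
```

With sparse weights, the time steps that contribute to a prediction can be read off directly, which is one reason attention is often described as improving interpretability.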

4 Discussion

The IDF curves from the statistical method show reasonable fidelity to the held-out dataset used for validation, with satisfactory R² and NSE scores indicating an acceptable fit to the observed data. This demonstrates that applying statistical methods to satellite-derived datasets is a viable approach to IDF curve generation, especially given that the measured curves fail to reflect trends during the study period, as shown by their negative NSE. Among the tested models, the temporal convolutional architectures (TCAN and TCN) deliver the best performance for IDF curve construction, with TCAN achieving the highest R² and NSE and the lowest MAE, confirming its superior ability to capture temporal patterns. TCN follows closely, reinforcing the advantage of deep learning methods over statistical approaches. ANN also provides strong, reliable results, though its practical use may be limited by tuning requirements. SVM offers lower accuracy but remains useful for baseline comparisons. Overall, these findings support prioritizing TCAN, which achieves the highest correspondence with held-out samples and offers improved interpretability, with TCN as a strong second choice. This ordering also reflects the fact that, although the metrics are close, TCAN leverages the attention mechanism to converge faster and to obtain an extended receptive field without requiring additional architectural depth. The ANN then offers robust, efficient modeling where rapid optimization and ease of deployment are essential, and the SVM should be retained mainly for preliminary exploration and baseline comparisons.

To further validate the results, five historical extreme events are analyzed to measure first how much the statistical method improves upon the measured IDF curves, and then how much the ML-based curves improve upon the statistical method. In each case, at least one ML-based IDF curve set surpasses the curves from the statistical method by more than 10%, as shown in Table 3. The TCAN curves appear most frequently among the top performers with the closest predictions. The TCN curves rank closely behind: they lead less often, but often give the better predictions when they do. The ANN curves also emerge as a contender, notably at longer durations. Across the specified events, the ML-based curves improve by 61.24% over the IDF curves generated by the statistical method. Furthermore, the statistical-method curves represent the historical data better than the measured IDF curves, with an improvement of 84.7%, which supports their use as the improvement benchmark for this experiment.
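
The improvement percentages quoted for the extreme events can be read as relative reductions in prediction error. The exact formula behind Table 3 is not given in this section, so the sketch below assumes the relative reduction in absolute error for a single event; the function name and the numbers are hypothetical.

```python
def percent_improvement(observed, baseline_pred, candidate_pred):
    """Relative reduction (%) in absolute error of a candidate over a baseline."""
    baseline_err = abs(observed - baseline_pred)
    candidate_err = abs(observed - candidate_pred)
    if baseline_err == 0:
        raise ValueError("baseline error is zero; improvement is undefined")
    return 100.0 * (baseline_err - candidate_err) / baseline_err

# Hypothetical event: observed 50 mm/h, statistical curve 40 mm/h, ML curve 47 mm/h.
improvement = percent_improvement(50.0, 40.0, 47.0)
print(f"{improvement:.1f}% improvement over the baseline")
```

A negative value under this definition would mean the candidate curve is further from the observed event than the baseline.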

Table 3. Extreme events showcasing the performance of ML-based IDF curves compared to statistical methods.

An uncertainty analysis was also conducted, with results outlined in Table 4. It reveals that all ML-based models outperform the Gumbel distribution in predictive confidence and reliability, though to markedly different degrees. Epistemic uncertainty decreases systematically as the models advance from SVM to TCAN: relative to Gumbel, TCAN reduces model knowledge gaps by 81.7%, followed by TCN (81.1%), ANN (76.0%), and SVM (71.5%). Total uncertainty, which combines inherent data noise and model uncertainty, improves monotonically in the same order, with reductions of 62.2% for TCAN, 60.8% for TCN, 50.3% for ANN, and 41.0% for SVM relative to the Gumbel baseline, indicating that greater architectural sophistication translates into greater prediction confidence. Prediction interval precision follows a comparable pattern, with TCAN and TCN providing the narrowest bounds (25.89 and 26.82 units, respectively), affording engineers tighter design tolerances than Gumbel’s wider intervals (28.82 units). Crucially, TCAN is the only model to exceed the 95% coverage target, at 95.6%, while TCN, ANN, and SVM each achieve 93.4%, improving on Gumbel’s 92.3%. This is a critical distinction for trustworthy design applications, where underestimated uncertainty poses operational risk. Finally, consistency across different conditions is highest for TCAN and TCN (coefficient of variation (CV) of 0.259 and 0.268, respectively), moderate for Gumbel (0.288), and lowest for ANN and SVM (0.340 and 0.404, where a higher CV indicates less consistency). Temporal convolutional architectures therefore maintain stable predictive quality across the full IDF curve spectrum, whereas kernel-based and fully connected approaches show more performance variability.
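
Two of the quantities discussed above, interval coverage and interval width, are simple to compute once lower and upper prediction bounds are available. The sketch below assumes the standard definitions (fraction of observations inside the interval, and mean upper-minus-lower width); how the bounds themselves were derived in this study is not covered here, and the sample arrays are hypothetical. The final call reuses the interval widths quoted in the text for TCAN and Gumbel.

```python
def coverage(observed, lower, upper):
    """Fraction of observations that fall inside their prediction interval."""
    inside = sum(1 for o, lo, hi in zip(observed, lower, upper) if lo <= o <= hi)
    return inside / len(observed)

def mean_interval_width(lower, upper):
    """Average width of the prediction intervals."""
    return sum(hi - lo for lo, hi in zip(lower, upper)) / len(lower)

def percent_reduction(baseline, value):
    """Relative reduction (%) of a quantity with respect to a baseline."""
    return 100.0 * (baseline - value) / baseline

# Hypothetical intensities (mm/h) and interval bounds.
observed = [10.0, 20.0, 30.0, 40.0]
lower = [8.0, 21.0, 25.0, 35.0]
upper = [12.0, 25.0, 35.0, 45.0]

print("coverage:", coverage(observed, lower, upper))
print("mean width:", mean_interval_width(lower, upper))

# Interval widths quoted in the text: TCAN 25.89 vs Gumbel 28.82 units.
width_reduction = percent_reduction(28.82, 25.89)
print(f"TCAN interval is {width_reduction:.1f}% narrower than Gumbel")
```

Coverage and width should be read together: a narrow interval is only useful if its coverage stays near the nominal target, which is the combination the text reports for TCAN.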

Table 4. Uncertainty analysis results showcasing the clear difference between the statistical methods and the ML-based methods.

Collectively, these findings establish that adaptive AI architectures, particularly those with temporal convolution and attention mechanisms, not only improve conventional accuracy metrics such as R² and NSE but also reduce model knowledge gaps, narrow prediction bounds, yield well-calibrated confidence estimates, and maintain robust performance across all intensity-duration-return period combinations. This supports their use as the preferred approach for generating reliable IDF curves and for risk-aware infrastructure design. By avoiding rigid distributional assumptions, ML models provide more statistically robust and climatically responsive design storms. For engineers, this translates to infrastructure more closely tailored to local rainfall dynamics, with direct benefits in cost efficiency, safety, and resilience against rare extremes. Their adaptability across different durations and return periods further supports application in diverse hydrological settings, from flash-flood-prone catchments to urban pluvial hotspots.

Nevertheless, the complexity of deep learning models introduces challenges. Computational requirements and calibration demands may hinder operational uptake (Wang, 2024). Furthermore, although ML-based approaches show a remarkable ability to generalize and to explain structured datasets, the use of data disaggregated from daily records has inherent limitations. These disaggregated datasets are rough estimates that may still differ from real-world observations because of uncertainties in the disaggregation process and the resolution of the original data. They are employed here primarily to explore the ability of ML models to construct IDF curves where precipitation datasets are insufficient or unavailable, as a promising alternative to traditional statistical methods. Moreover, this study is limited to Beirut, does not include other regions in Lebanon, and uses precipitation readings only from GPM-IMERG, making it sensitive to errors in the IMERG retrievals. Future research should therefore incorporate explainable AI techniques to reveal which properties ML models leverage to generate better IDF curves, sample readings from multiple satellite sources to reduce this sensitivity, implement more advanced and robust preprocessing and disaggregation methods, and integrate these models into real-time updating frameworks and climate projection ensembles, enabling their use in operational flood forecasting and long-term adaptation planning.

5 Conclusion

Hydrological infrastructure planning, urban design and flood risk analysis have long relied on IDF curves, commonly supported by classical statistical models. However, several issues stem from the structural limitations of traditional IDF curves: their dependence on rigid assumptions and parameters not only fails to reflect evolving rainfall patterns but also amplifies uncertainty, making these models less reliable for predicting present and future extremes. With climate variability introducing increased uncertainty and complexity in precipitation patterns, the limitations of traditional methods have become more apparent. This study addresses these challenges by adopting a framework that combines high-resolution satellite rainfall data from GPM-IMERG with ML and DL models to construct IDF curves, including SVM, ANN, and especially TCN and TCAN, a novel attention-based variant.

By rigorously processing precipitation data and benchmarking performance, the study reveals that DL models, particularly TCN and TCAN, significantly outperform traditional methods such as the Gumbel distribution. TCAN proved especially accurate, achieving an NSE of 0.959 and effectively capturing complex rainfall structures and extreme events. Historical validation using 28 years of data substantiated these findings, with the ML-based models reducing prediction errors for extreme events by at least 61.24% and up to 80.0%, and reducing uncertainty by between 41.0% and 62.2%.

These results have important implications for urban water infrastructure and climate resilience. The integration of satellite remote sensing with AI enables more precise and adaptive IDF curve development, which is crucial for optimizing drainage design and managing flood risks, especially in data-scarce regions like Lebanon. Using ML-based methods to estimate and construct the IDF curves directly also yields a better understanding of the variance, narrower prediction bounds and less uncertainty in the predictions.

Data availability statement

The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/supplementary material.

Author contributions

ED: Data curation, Investigation, Software, Writing – original draft. CA: Conceptualization, Formal analysis, Investigation, Methodology, Project administration, Supervision, Validation, Visualization, Writing – review & editing.

Funding

The author(s) declared that financial support was not received for this work and/or its publication.

Conflict of interest

The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declared that Generative AI was used in the creation of this manuscript. GPT was used only to assist with the writing.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Acar, R., Çelik, S., and Şenocak, S. (2008). Rainfall intensity-duration-frequency (IDF) model using an artificial neural network approach. J. Sci. Ind. Res. 67.

Ameen, S. M., Aziz, S. Q., Dawood, A. H., Sabir, A. T., and Hawez, D. M. (2025). Utilizing machine learning and deep learning for precise intensity-duration-frequency (IDF) curve predictions. Polytech. J. 15, 27–38. doi: 10.59341/2707-7799.1848

Andraos, C., and Najem, W. (2020). “Multi-model approach for reducing uncertainties in rainfall-runoff models” in Advances in Hydroinformatics. eds. P. Gourbesville and G. Caignaert (Singapore: Springer Singapore), 545–557.

Bai, S., Kolter, J. Z., and Koltun, V. (2018). An empirical evaluation of generic convolutional and recurrent networks for sequence modeling

Basumatary, V., and Sil, B. S. (2016). Generation of rainfall intensity-duration-frequency curves for the Barak River basin. Environ. Res. doi: 10.26491/mhwm/79175

Bell, F. C. (1969). Generalized rainfall—duration—frequency relationships. J. Hydraul. Div. 95, 311–328. doi: 10.1061/JYCEAJ.0001942

Burian, S. J., Durrans, S. R., Tomić, S., Pimmel, R. L., and Wai, C. N. (2000). Rainfall disaggregation using artificial neural networks. J. Hydrol. Eng. 5, 299–307. doi: 10.1061/(ASCE)1084-0699(2000)5:3(299)

Camp Dresser, McKee Inc., Khatib and Alami. (1982). Master plan for stormwater drainage (no. 3; National Waste Management Plan, p. 3.C – 18). Lebanese Council for Development and Reconstruction, United Nations Development Programme (UNDP), World Health Organization (WHO). Available online at: http://www.studies.gov.lb/Sectors/Infrastructure-and-Resources/1982/WAT-h82-3

Cheng, L., and AghaKouchak, A. (2014). Nonstationary precipitation intensity-duration-frequency curves for infrastructure design in a changing climate. Sci. Rep. 4:7093. doi: 10.1038/srep07093

Collalti, D., Spencer, N., and Strobl, E. (2024). Flash flood detection via copula-based intensity–duration–frequency curves: evidence from Jamaica. Nat. Hazards Earth Syst. Sci. 24, 873–890. doi: 10.5194/nhess-24-873-2024

Drucker, H., Burges, C. J. C., Kaufman, L., Smola, A., and Vapnik, V. (1996). “Support vector regression machines” in Advances in neural information processing systems. eds. M. C. Mozer, M. Jordan, and T. Petsche, vol. 9 (MIT Press).

Gebrechorkos, S. H., Leyland, J., Dadson, S. J., Cohen, S., Slater, L., Wortmann, M., et al. (2024). Global-scale evaluation of precipitation datasets for hydrological modelling. Hydrol. Earth Syst. Sci. 28, 3099–3118. doi: 10.5194/hess-28-3099-2024

Goan, E., and Fookes, C. (2020). “Bayesian neural networks: an introduction and survey” in Case studies in applied Bayesian data science (Springer International Publishing), 45–87.

Goodfellow, I., Bengio, Y., and Courville, A. (2016). Deep learning. MIT Press, 194–198.

Gorelick, N., Hancher, M., Dixon, M., Ilyushchenko, S., Thau, D., and Moore, R. (2017). Google Earth Engine: Planetary-scale geospatial analysis for everyone. Remote Sensing of Environment. 202, 18–27. doi: 10.1016/j.rse.2017.06.031

Govindaraju, R. S. (2000). Artificial neural networks in hydrology. I: preliminary concepts. J. Hydrol. Eng. 5, 115–123. doi: 10.1061/(ASCE)1084-0699(2000)5:2(115)

Gruss, Ł., Willems, P., Tomczyk, P., Pollert, J., Pollert, J., Märtner, C., et al. (2020). The application of new distribution in determining extreme hydrologic events such as floods. Hydrol. Earth Syst. Sci. Discuss. 2020, 1–31. doi: 10.5194/hess-2020-173

Gumbel, E. J. (1935). Les valeurs extrêmes des distributions statistiques. Ann. Inst. Henri Poincare 5, 115–158.

Hornik, K., Stinchcombe, M., and White, H. (1989). Multilayer feedforward networks are universal approximators. Neural Netw. 2, 359–366. doi: 10.1016/0893-6080(89)90020-8

Hu, H., and Ayyub, B. M. (2019). Machine learning for projecting extreme precipitation intensity for short durations in a changing climate. Geosciences 9. doi: 10.3390/geosciences9050209

Huffman, G. J., Stocker, E. F., Bolvin, D. T., Nelkin, E. J., and Tan, J. (2023a). GPM IMERG final precipitation L3 half hourly 0.1 degree x 0.1 degree V07. Goddard earth sciences data and information services center (GES DISC)

Huffman, G. J., Stocker, E. F., Bolvin, D. T., Nelkin, E. J., and Tan, J. (2023b). GPM IMERG final precipitation L3 day 0.1 degree x 0.1 degree V07. Goddard earth sciences data and information services center (GES DISC)

Kassem, Y., and Gökçekuş, H. (2020). Water resources and rainfall distribution function: a case study in Lebanon. Desalin. Water Treat. 177, 306–321. doi: 10.5004/dwt.2020.24811

Lau, A., and Behrangi, A. (2022). Understanding intensity–duration–frequency (IDF) curves using IMERG sub-hourly precipitation against dense gauge networks. Remote Sens 14. doi: 10.3390/rs14195032

Lin, Y., Koprinska, I., and Rana, M. (2021). Temporal convolutional attention neural networks for time series forecasting. In Proceedings of the 2021 International Joint Conference on Neural Networks (IJCNN), 1–8.

Marra, F., Koukoula, M., Canale, A., and Peleg, N. (2024). Predicting extreme sub-hourly precipitation intensification based on temperature shifts. Hydrol. Earth Syst. Sci. 28, 375–389. doi: 10.5194/hess-28-375-2024

Martel, J. L., Brissette, F. P., Lucas-Picher, P., Troin, M., and Arsenault, R. (2021). Climate change and rainfall intensity–duration–frequency curves: overview of science and guidelines for adaptation. J. Hydrol. Eng. 26:03121001. doi: 10.1061/(ASCE)HE.1943-5584.0002122

Milojevic, T., Blanchet, J., and Lehning, M. (2023). Determining return levels of extreme daily precipitation, reservoir inflow, and dry spells. Front. Water 5:1141786. doi: 10.3389/frwa.2023.1141786

Nash, J. E., and Sutcliffe, J. V. (1970). River flow forecasting through conceptual models part I — a discussion of principles. J. Hydrol. 10, 282–290. doi: 10.1016/0022-1694(70)90255-6

Nguyen, P., Ombadi, M., Sorooshian, S., Hsu, K., AghaKouchak, A., Braithwaite, D., et al. (2018). The PERSIANN family of global satellite precipitation data: a review and evaluation of products. Hydrol. Earth Syst. Sci. 22, 5801–5816. doi: 10.5194/hess-22-5801-2018

Obeid, H., and Elkholy, M. (2021). Generalization of intensity-duration-frequency formula for Litani River basin-Lebanon. Int. J. Appl. Sci. 4, 11–25. doi: 10.30560/ijas.v4n1p11

Ramaseshan, S. (1996). Urban hydrology in different climatic conditions. Regional Engineering College.

Rumelhart, D. E., Hinton, G. E., and Williams, R. J. (1986). Learning representations by back-propagating errors. Nature 323, 533–536. doi: 10.1038/323533a0

Shehu, B., and Haberlandt, U. (2023). Uncertainty estimation of regionalised depth–duration–frequency curves in Germany. Hydrol. Earth Syst. Sci. 27, 2075–2097. doi: 10.5194/hess-27-2075-2023

Sherman, C. W. (1931). Frequency and intensity of excessive rainfalls at Boston, Massachusetts. Trans. Am. Soc. Civ. Eng. 95, 951–960. doi: 10.1061/TACEAT.0004286

Soto-Escobar, C., Zambrano-Bigiarini, M., Tolorza, V., and Garreaud, R. (2025). Gridded intensity-duration-frequency (IDF) curves: understanding precipitation extremes in a drying climate. EGUsphere 2025, 1–44. doi: 10.5194/egusphere-2025-621

Thacker, N. A., Twining, C. J., Tar, P. D., Notley, S., and Ramesh, V. (2020). Fundamental issues regarding uncertainties in artificial neural networks

Ulrich, J., Fauer, F. S., and Rust, H. W. (2021). Modeling seasonal variations of extreme rainfall on different timescales in Germany. Hydrol. Earth Syst. Sci. 25, 6133–6149. doi: 10.5194/hess-25-6133-2021

U.S. Geological Survey. (2018). Landsat 8 (L8) Data Users Handbook (LSDS-1574, Version 4.0). Department of the Interior, U.S. Geological Survey, Earth Resources Observation and Science Center. Available online at: https://www.usgs.gov/media/files/landsat-8-data-users-handbook

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., et al. (2017). Attention is all you need

Wang, C. (2024). Calibration in deep learning: A survey of the state-of-the-art

Wang, Y., and McBean, E. (2014). Uncertainty characterization of rainfall inputs used in the design of storm sewer infrastructure. J. Water Manag. Model. doi: 10.14796/JWMM.C367

Wehner, M. F., Duffy, M. L., Risser, M., Paciorek, C. J., Stone, D. A., and Pall, P. (2024). On the uncertainty of long-period return values of extreme daily precipitation. Front. Clim. 6:1343072. doi: 10.3389/fclim.2024.1343072

Weibull, W. (1951). A statistical distribution function of wide applicability. J. Appl. Mech. 18, 293–297. doi: 10.1115/1.4010337

Yassin, N. (2012). Beirut. Cities 29, 64–73. doi: 10.1016/j.cities.2011.02.001

Keywords: extreme precipitation, intensity-duration-frequency, machine learning, satellite-based rainfall, uncertainty

Citation: Dargham E and Andraos C (2026) Development of intensity-duration-frequency curves using machine learning and satellite-derived precipitation data. Front. Water. 8:1727182. doi: 10.3389/frwa.2026.1727182

Received: 17 October 2025; Revised: 30 December 2025; Accepted: 02 January 2026;
Published: 29 January 2026.

Edited by:

Dagang Wang, Sun Yat-sen University, China

Reviewed by:

Jingyu Wang, Nanyang Technological University, Singapore
Milan Stojkovic, Institute for Artificial Intelligence R&D Serbia, Serbia
Athanasios Serafeim, University of Peloponnese, Greece
Ziaul Haq Doost, King Fahd University of Petroleum and Minerals, Saudi Arabia

Copyright © 2026 Dargham and Andraos. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Cynthia Andraos
