
ORIGINAL RESEARCH article

Front. Water

Sec. Water Resource Management

Volume 7 - 2025 | doi: 10.3389/frwa.2025.1649284

Deep Reinforcement Learning for Complex Hydropower Management: Evaluating Soft Actor-Critic with a Learned System Dynamics Model

Provisionally accepted
  • 1Rey Juan Carlos University, Móstoles, Spain
  • 2European Commission Joint Research Centre Ispra, Ispra, Italy
  • 3Universidad de Alcala, Alcala de Henares, Spain
  • 4Corporacion electrica del Ecuador, CELEC EP, Quito, Ecuador

The final, formatted version of the article will be published soon.

Optimizing the operation of interconnected hydropower systems presents significant challenges due to complex non-linear dynamics, hydrological uncertainty, and the need to balance competing objectives such as economic revenue maximization and operational safety. Traditional optimization methods often struggle with these complexities, particularly for high-resolution intraday decision-making. This paper proposes and evaluates a Deep Reinforcement Learning (DRL) framework, specifically utilizing the Soft Actor-Critic (SAC) algorithm, to optimize the hourly operation of the Baba hydropower facility and its strategic water transfers to the downstream Marcel Laniado De Wind (MLDW) system in Ecuador's Guayas basin. A key component of our approach is a custom Gymnasium simulation environment incorporating a validated internal dynamics model based on a pre-trained neural network. This learned model, developed using historical inflow data, accurately simulates the system's hydraulic and energy state transitions. The SAC agent was trained within this environment on synthetically generated (KNN-resampled) data to learn policies that maximize the combined economic revenue from Baba generation and the estimated downstream MLDW generation benefit, while adhering to stringent operational and safety constraints. Results demonstrate that the learned SAC policies significantly outperform historical operations, achieving up to a 9.43% increase in total accumulated economic gain over a decade-long validation period. Furthermore, the agent effectively learned to manage constraints, notably reducing peak uncontrolled spillway discharges by up to 9%. This study validates the effectiveness of SAC combined with a learned internal dynamics model as a robust, data-driven approach for optimizing complex, interconnected hydropower systems, offering a promising pathway towards more efficient and resilient water resource management.
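To make the described setup concrete, the sketch below shows the general shape of an hourly reservoir-operation environment of the kind the abstract outlines: a state (storage), actions (turbine release and a transfer to the downstream system), a stand-in for the learned inflow/dynamics model, and a reward combining generation revenue with a spill penalty. This is a minimal illustrative sketch only; the class name, all capacities, prices, penalty coefficients, and the random inflow stand-in are assumptions, not the authors' calibrated Baba/MLDW model or their actual Gymnasium environment.

```python
import random

class BabaEnvSketch:
    """Hypothetical, simplified stand-in for the paper's custom Gymnasium
    environment. All numeric values below are illustrative assumptions."""

    S_MAX = 100.0  # assumed storage capacity (hm^3)
    Q_MAX = 5.0    # assumed max hourly turbine release (hm^3)
    T_MAX = 3.0    # assumed max hourly transfer to the downstream system (hm^3)

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.storage = None

    def reset(self):
        # Start half full (arbitrary choice for the sketch).
        self.storage = 0.5 * self.S_MAX
        return self.storage

    def _inflow(self):
        # Stand-in for the paper's pre-trained neural-network dynamics
        # model: here just a bounded random hourly inflow.
        return self.rng.uniform(0.0, 4.0)

    def step(self, release, transfer):
        # Clip actions to physical and availability limits.
        release = max(0.0, min(release, self.Q_MAX, self.storage))
        transfer = max(0.0, min(transfer, self.T_MAX, self.storage - release))
        self.storage += self._inflow() - release - transfer
        # Anything above capacity leaves as uncontrolled spillway discharge.
        spill = max(0.0, self.storage - self.S_MAX)
        self.storage = min(self.storage, self.S_MAX)
        # Reward: Baba generation revenue + estimated downstream benefit,
        # minus a spill penalty (coefficients are illustrative).
        reward = 1.0 * release + 0.8 * transfer - 0.5 * spill
        return self.storage, reward

env = BabaEnvSketch(seed=42)
state = env.reset()
total = 0.0
for _ in range(24):  # one illustrative day of hourly decisions
    state, r = env.step(release=2.0, transfer=1.0)  # placeholder fixed policy
    total += r
```

In the actual study, the fixed placeholder policy would be replaced by an SAC agent interacting with the environment, and the `_inflow`/transition logic by the validated neural-network dynamics model trained on historical data.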

Keywords: Water Resources Management, reservoir operation, Deep Reinforcement Learning, Hydropower optimization, soft actor-critic

Received: 18 Jun 2025; Accepted: 28 Aug 2025.

Copyright: © 2025 Udias and Campo Carrera. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Angel Udias, Rey Juan Carlos University, Móstoles, Spain

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.