ORIGINAL RESEARCH article
Front. Water
Sec. Water Resource Management
Volume 7 - 2025 | doi: 10.3389/frwa.2025.1649284
Deep Reinforcement Learning for Complex Hydropower Management: Evaluating Soft Actor-Critic with a Learned System Dynamics Model
Provisionally accepted
- 1 Rey Juan Carlos University, Móstoles, Spain
- 2 European Commission Joint Research Centre (JRC), Ispra, Italy
- 3 Universidad de Alcala, Alcala de Henares, Spain
- 4 Corporacion electrica del Ecuador, CELEC EP, Quito, Ecuador
Optimizing the operation of interconnected hydropower systems presents significant challenges due to complex non-linear dynamics, hydrological uncertainty, and the need to balance competing objectives such as economic maximization and operational safety. Traditional optimization methods often struggle with these complexities, particularly for high-resolution intraday decision-making. This paper proposes and evaluates a Deep Reinforcement Learning (DRL) framework, specifically utilizing the Soft Actor-Critic (SAC) algorithm, to optimize the hourly operation of the Baba hydropower facility and its strategic water transfers to the downstream Marcel Laniado De Wind (MLDW) system in Ecuador's Guayas basin.

A key component of our approach is a custom Gymnasium simulation environment incorporating a validated internal dynamics model based on a pre-trained neural network. This learned model, developed using historical inflow data, accurately simulates the system's hydraulic and energy state transitions. The SAC agent was trained within this environment using synthetically generated data (KNN-resampled) to learn policies that maximize the combined economic revenue from Baba generation and the estimated downstream MLDW generation benefit, while adhering to stringent operational and safety constraints.

Results demonstrate that the learned SAC policies significantly outperform historical operations, achieving up to a 9.43% increase in total accumulated economic gain over a decade-long validation period. Furthermore, the agent effectively learned to manage constraints, notably reducing peak uncontrolled spillway discharges by up to 9%. This study validates the effectiveness of SAC combined with a learned internal dynamics model as a robust, data-driven approach for optimizing complex, interconnected hydropower systems, offering a promising pathway towards more efficient and resilient water resource management.
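To make the environment design concrete, the sketch below shows a minimal Gymnasium-style reservoir environment of the kind the abstract describes. It is purely illustrative and not the paper's model: the learned neural-network dynamics are stood in by a simple mass-balance update, and all class names, capacities, prices, and penalty weights (`BabaEnvSketch`, `TURBINE_CAP`, `SPILL_PENALTY`, etc.) are hypothetical assumptions. The reward structure mirrors the stated objective: generation revenue minus a penalty on uncontrolled spillway discharge.

```python
# Illustrative sketch only (NOT the paper's actual environment or dynamics model).
# A Gymnasium-style hourly reservoir environment: action = fraction of turbine
# capacity to release; reward = generation revenue minus a spill penalty.
# All constants and names below are hypothetical stand-ins.

class BabaEnvSketch:
    MAX_STORAGE = 100.0   # hm^3, storage above which uncontrolled spill occurs
    TURBINE_CAP = 5.0     # hm^3/h maximum release through the turbines
    PRICE = 1.0           # revenue per hm^3 turbined (placeholder economics)
    SPILL_PENALTY = 2.0   # penalty per hm^3 of uncontrolled spillway discharge

    def __init__(self, inflows):
        self.inflows = inflows  # hourly inflow series (hm^3); historical or KNN-resampled
        self.reset()

    def reset(self):
        self.t = 0
        self.storage = 50.0  # arbitrary initial storage
        return (self.storage, self.inflows[self.t])  # observation

    def step(self, action):
        # Clip the action to [0, 1]: fraction of turbine capacity released this hour.
        a = min(max(action, 0.0), 1.0)
        release = min(a * self.TURBINE_CAP, self.storage)  # cannot release more than stored
        # Mass balance stands in for the learned neural-network transition model.
        self.storage += self.inflows[self.t] - release
        spill = max(self.storage - self.MAX_STORAGE, 0.0)  # uncontrolled spillway discharge
        self.storage -= spill
        reward = self.PRICE * release - self.SPILL_PENALTY * spill
        self.t += 1
        done = self.t >= len(self.inflows)
        obs = (self.storage, self.inflows[self.t] if not done else 0.0)
        return obs, reward, done


if __name__ == "__main__":
    env = BabaEnvSketch(inflows=[4.0, 6.0, 3.0])
    obs = env.reset()
    total, done = 0.0, False
    while not done:
        obs, r, done = env.step(1.0)  # naive "always release at capacity" policy
        total += r
    print(round(total, 2))  # -> 15.0 for this toy inflow series
```

In the study itself, a SAC agent (e.g. via a library such as Stable-Baselines3) would be trained against the real environment's `step`/`reset` interface, with the learned dynamics model replacing the toy mass balance above.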
Keywords: Water Resources Management, reservoir operation, Deep Reinforcement Learning, hydropower optimization, Soft Actor-Critic
Received: 18 Jun 2025; Accepted: 28 Aug 2025.
Copyright: © 2025 Udias and Campo Carrera. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Angel Udias, Rey Juan Carlos University, Móstoles, Spain
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.