AUTHOR=Campo Carrera José María , Udias Angel TITLE=Deep Reinforcement Learning for complex hydropower management: evaluating Soft Actor-Critic with a learned system dynamics model JOURNAL=Frontiers in Water VOLUME=Volume 7 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/water/articles/10.3389/frwa.2025.1649284 DOI=10.3389/frwa.2025.1649284 ISSN=2624-9375 ABSTRACT=IntroductionOptimizing the operation of interconnected hydropower systems presents significant challenges due to complex non-linear dynamics, hydrological uncertainty, and the need to balance competing objectives like economic maximization and operational safety. Traditional optimization methods often struggle with these complexities, particularly for high-resolution intraday decision-making.MethodsThis paper proposes and evaluates a Deep Reinforcement Learning (DRL) framework, specifically utilizing the Soft Actor-Critic (SAC) algorithm, to optimize the hourly operation of the Baba hydropower facility and its strategic water transfers to the downstream Marcel Laniado De Wind (MLDW) system in Ecuador's Guayas basin. A key component of our approach is a custom Gymnasium simulation environment incorporating a validated internal dynamics model based on a pre-trained neural network. This learned model, developed using historical inflow data, accurately simulates the system's hydraulic and energy state transitions. The SAC agent was trained within this environment using synthetically generated data (KNN-resampled) to learn policies that maximize the combined economic revenue from Baba generation and the estimated downstream MLDW generation benefit, while adhering to stringent operational and safety constraints.ResultsResults demonstrate that the learned SAC policies significantly outperform historical operations, achieving up to a 9.43% increase in total accumulated economic gain over a decade-long validation period. Furthermore, the agent effectively learned to manage constraints, notably reducing peak uncontrolled spillway discharges by up to 9%.DiscussionThis study validates the effectiveness of SAC combined with a learned internal dynamics model as a robust, data-driven approach for optimizing complex, interconnected hydropower systems, offering a promising pathway toward more efficient and resilient water resource management.