ORIGINAL RESEARCH article
Front. Artif. Intell.
Sec. Machine Learning and Artificial Intelligence
This article is part of the Research Topic: Physical AI and Robotics – Outputs from IS-PAIR 2025 and Beyond
Improvements to Dark Experience Replay and Reservoir Sampling for Better Balance Between Consolidation and Plasticity
Provisionally accepted
1 National Institute of Informatics, Chiyoda-ku, Japan
2 Sogo Kenkyu Daigakuin Daigaku (The Graduate University for Advanced Studies, SOKENDAI), Miura District, Japan
Continual learning is one of the most essential abilities for autonomous agents, which must incrementally learn daily-life skills even with limited computational resources. To achieve this goal, a simple yet powerful method called dark experience replay (DER) was recently proposed. DER mitigates catastrophic forgetting, in which skills acquired in the past are unintentionally forgotten while learning new skills, by stochastically storing streaming data in a reservoir sampling (RS) buffer and either relearning the stored data or retaining their past outputs. However, because DER considers multiple objectives, it does not function properly without appropriate weighting for each problem. In addition, the ability to retain past outputs inhibits learning if those outputs are inconsistent owing to distribution shifts or other effects. This is a consequence of the trade-off between memory consolidation and plasticity. The same trade-off is hidden even in the RS buffer, which gradually stops storing new data for new skills as data are continuously passed to it. To alleviate this trade-off and achieve a better balance, this study proposes improvement strategies for both DER and RS. Specifically, DER is improved by the automatic adaptation of weights, blocking of replay for inconsistent data, and correction of past outputs. RS is improved by the generalization of the acceptance probability, stratification into multiple buffers, and intentional omission of inconsistent data. These improvements were verified on multiple benchmarks covering regression, classification, and reinforcement learning problems. Consequently, the proposed methods achieved a steady improvement in learning performance by balancing memory consolidation and plasticity.
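For readers unfamiliar with the baseline that the abstract refers to, the sketch below illustrates standard reservoir sampling and a DER-style replay loss. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `ReservoirBuffer` and `der_loss`, the fixed weight `alpha`, and the use of logits as the stored "past outputs" are all illustrative choices, and the paper's proposed improvements (adaptive weighting, generalized acceptance probability, stratified buffers, and omission of inconsistent data) are not shown here.

```python
# Minimal sketch (assumed, not the authors' code) of the two baseline ingredients
# named in the abstract: a reservoir-sampling buffer and a DER-style replay loss.
import random
import torch
import torch.nn.functional as F


class ReservoirBuffer:
    """Standard reservoir sampling: every streamed item is kept with
    probability capacity / n_seen, so the buffer is a uniform sample
    of the whole stream."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.items = []       # list of (input, stored_output) tuples
        self.n_seen = 0

    def add(self, x: torch.Tensor, logits: torch.Tensor) -> None:
        self.n_seen += 1
        if len(self.items) < self.capacity:
            self.items.append((x, logits))
        else:
            # The acceptance probability shrinks as more data stream in,
            # which is the loss of plasticity discussed in the abstract.
            j = random.randrange(self.n_seen)
            if j < self.capacity:
                self.items[j] = (x, logits)

    def sample(self, batch_size: int):
        batch = random.sample(self.items, min(batch_size, len(self.items)))
        xs, zs = zip(*batch)
        return torch.stack(xs), torch.stack(zs)


def der_loss(model, x, y, buffer: ReservoirBuffer,
             alpha: float = 0.5, batch_size: int = 32) -> torch.Tensor:
    """Current-task loss plus a distillation term that pulls the model's
    outputs on replayed inputs toward the outputs stored at insertion time
    ("retaining their past outputs"). `alpha` is the fixed weight that the
    proposed method replaces with automatic adaptation."""
    loss = F.cross_entropy(model(x), y)
    if len(buffer.items) >= batch_size:
        x_re, z_re = buffer.sample(batch_size)
        loss = loss + alpha * F.mse_loss(model(x_re), z_re)
    return loss
```

In this baseline the replay weight is fixed and stored outputs are always trusted; the abstract's contribution is precisely to relax both assumptions while restoring the buffer's ability to accept new data.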
Keywords: consolidation and plasticity, continual learning, dark experience replay, reinforcement learning, reservoir sampling
Received: 18 Jun 2025; Accepted: 28 Jan 2026.
Copyright: © 2026 Kobayashi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Taisuke Kobayashi
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.