ORIGINAL RESEARCH article

Front. Robot. AI

Sec. Robot Learning and Evolution

Discovery of Skill Switching Criteria for Learning Agile Quadruped Locomotion

Provisionally accepted
  • 1University of Edinburgh, Edinburgh, United Kingdom
  • 2University of Oxford, Oxford, United Kingdom
  • 3University College London, London, United Kingdom
  • 4Chongqing University, Chongqing, China

The final, formatted version of the article will be published soon.

This paper develops a hierarchical learning and optimization framework that learns well-coordinated multi-skill locomotion. The learned multi-skill policy switches between skills automatically and naturally while tracking arbitrarily positioned goals, and it recovers promptly from failures. The framework consists of a deep reinforcement learning process and an optimization process. First, the contact pattern is incorporated into the reward terms so that different types of gaits are learned as separate policies, without the need for any other references. Then, a higher-level policy is learned that generates weights for the individual policies, composing multi-skill locomotion in a goal-tracking task setting. Skills are switched automatically and naturally according to the distance to the goal. The proper distances for skill switching are incorporated into the reward calculation for learning the high-level policy and are updated by an outer optimization loop as learning progresses. We first demonstrate successful multi-skill locomotion in comprehensive tasks on a simulated Unitree A1 quadruped robot. We also deploy the learned policy in the real world, showcasing trotting, bounding, galloping, and their natural transitions as the goal position changes. Moreover, the learned policy can react to unexpected failures at any time, perform prompt recovery, and resume locomotion successfully. Compared to baselines, the proposed approach achieves all the learned agile skills with improved learning performance, enabling smoother and more continuous skill transitions.
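The structure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation, and everything specific in it is an assumption made for illustration: the weights are assumed to blend the actions of the skill policies (the abstract only says that weights are generated for the individual policies), the skills are assumed to be ordered by distance band, the switching reward is a simple stand-in, and the outer loop is a plain random search. The names `compose_action`, `target_skill`, `switching_reward`, and `update_thresholds` are hypothetical.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    z = x - np.max(x)
    e = np.exp(z)
    return e / e.sum()


def compose_action(skill_actions, logits):
    """Blend per-skill actions using weights from a high-level policy.

    Assumption: blending happens in action space. `skill_actions` has shape
    (n_skills, action_dim); `logits` has shape (n_skills,).
    """
    weights = softmax(np.asarray(logits, dtype=float))
    action = weights @ np.asarray(skill_actions, dtype=float)
    return action, weights


def target_skill(distance, thresholds):
    """Index of the skill preferred at this goal distance.

    Assumption: `thresholds` is sorted ascending and splits the distance
    axis into len(thresholds) + 1 bands, one per skill.
    """
    return int(np.searchsorted(thresholds, distance, side="right"))


def switching_reward(weights, distance, thresholds):
    """Illustrative reward: the weight placed on the preferred skill."""
    return float(weights[target_skill(distance, thresholds)])


def update_thresholds(thresholds, evaluate, rng, step=0.1, n_candidates=8):
    """One outer-loop step, sketched as random search (an assumption).

    `evaluate` is a hypothetical callable that scores a set of switching
    distances, for example by the return of the high-level policy.
    """
    best, best_score = thresholds, evaluate(thresholds)
    for _ in range(n_candidates):
        noise = rng.normal(0.0, step, size=thresholds.shape)
        cand = np.sort(np.clip(thresholds + noise, 0.0, None))
        score = evaluate(cand)
        if score > best_score:
            best, best_score = cand, score
    return best
```

With hypothetical switching distances of 1 m and 3 m, `target_skill` maps a goal 0.5 m away to skill 0 and a goal 5 m away to skill 2. Which gait occupies which band, and how the distances are actually optimized, is determined by the paper's method rather than by this sketch.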

Keywords: deep reinforcement learning, gait transitions, hierarchical learning and optimization, legged locomotion, multi-skill locomotion, robot learning, skill switching

Received: 01 Sep 2025; Accepted: 05 Jan 2026.

Copyright: © 2026 Yu, Acero, Atanassov, Yang, Havoutis, Kanoulas and Li. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence:
Wanming Yu
Dimitrios Kanoulas

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.