Diffusion Models for Robotic Manipulation: A Survey

Wolf, Rosa Petra; Shi, Yitian; Liu, Sheng; Rayyes, Rania

doi:10.3389/frobt.2025.1606247

REVIEW article

Front. Robot. AI

Sec. Robot Learning and Evolution

Volume 12 - 2025 | doi: 10.3389/frobt.2025.1606247

Provisionally accepted

Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany

The final, formatted version of the article will be published soon.

Diffusion generative models have demonstrated remarkable success in visual domains such as image and video generation. They have also recently emerged as a promising approach in robotics, especially in robotic manipulation. Diffusion models leverage a probabilistic framework and stand out for their ability to model multi-modal distributions and their robustness to high-dimensional input and output spaces. This survey provides a comprehensive review of state-of-the-art diffusion models in robotic manipulation, covering grasp learning, trajectory planning, and data augmentation. Diffusion models for scene and image augmentation lie at the intersection of robotics and computer vision, where they are used in vision-based tasks to improve generalizability and mitigate data scarcity. This paper also presents the two main frameworks of diffusion models and their integration with imitation learning and reinforcement learning. In addition, it discusses common architectures and benchmarks and points out the challenges and advantages of current state-of-the-art diffusion-based methods.

Keywords: diffusion models, robot manipulation learning, generative models, imitation learning, grasp learning

Received: 04 Apr 2025; Accepted: 14 Jul 2025.

Copyright: © 2025 Wolf, Shi, Liu and Rayyes. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Rosa Petra Wolf, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.