Leveraging Foundation Models for Interactive and Adaptive Robot Policy Learning

About this Research Topic

Submission deadlines

Manuscript Submission Deadline: 19 December 2025

This Research Topic is currently accepting articles.

Background

An intelligent AI agent must learn interactively from its environment and adapt its decision-making to handle novel situations. This adaptability is particularly crucial for robots operating under real-world, open-world, and partially observable conditions, where they routinely encounter unforeseen circumstances. Effective human-AI interaction and collaboration strategies can help robots refine their policies or learn new ones, and thereby operate more intelligently in dynamic environments. The recent emergence of Foundation Models (FMs), such as Large Language Models (LLMs), Large Vision Models (LVMs), Large Multimodal Vision-Language Models (LVLMs), and Robot Foundation Models (RFMs), presents new opportunities for interactive and adaptive robot policy learning. These models, which excel at context understanding, compositional reasoning, and decision-making, provide powerful tools for enhancing and refining robot policies. By leveraging these generalist models, robots can communicate with humans more effectively, adapt to new situations, and optimize their policies for real-world applications.

This Research Topic aims to investigate the transformative potential of foundation models for effective and adaptive robot policy learning. We seek to highlight cutting-edge research and innovative applications of these models, showcasing novel adaptation algorithms as well as human-robot interaction and collaboration strategies. Our focus is on how robots, empowered by off-the-shelf and/or domain-adapted foundation models, can perceive, reason, interact, and make intelligent decisions in open-world domains. By enabling effective and generalizable perception, reasoning, and multi-turn human-robot interaction, we aim to allow robots to acquire knowledge actively, and in a targeted manner, from external environments and from humans; to interpret and reason about that information in relation to its multi-modal context; and thereby to adapt new policies, refine existing ones, and enhance their task and motion planning abilities.

We invite researchers and practitioners to submit original research, review articles, case studies, and technical notes that explore, but are not limited to, the following areas:
- Applications of off-the-shelf Foundation Models for Robot Policy Learning
- Embodied Multi-modal Vision and Language models
- Efficient Domain Adaptation of Foundation Models for Policy Learning
- Interactive Robot Policy Learning and Grounding
- Human-Robot Collaboration for Policy Learning
- Policy Learning with Few-shot Demonstrations
- Data Augmentation using Foundation Models
- Interactive Reasoning and Task Planning with Foundation Models
- Integrated Planning and Foundation Models
- Applications and Fine-tuning of Foundation Models for Task and Motion Planning
- Learning Safe Policy through Human-Robot Interaction
- Trustworthy AI and AI Safety
- In-context Learning (ICL) for Decision-Making
- Knowledge Representation and Reasoning for Agents
- Interactive Open-Vocabulary Robot Navigation and Manipulation
- Policy Correction and Adaptation through Human-Robot Interaction
- Policy Evaluation using Foundation Models
- Applications and Fine-tuning of Foundation Models for Spoken Dialogue
- Detecting and Adapting to Novelty in Open-world Environments

Article types and fees

This Research Topic accepts the following article types, unless otherwise specified in the Research Topic description:

  • Brief Research Report
  • Data Report
  • Editorial
  • FAIR² Data
  • General Commentary
  • Hypothesis and Theory
  • Methods
  • Mini Review
  • Opinion

Articles that are accepted for publication by our external editors following rigorous peer review incur a publishing fee charged to Authors, institutions, or funders.

Keywords: LLM, Large Language Model, Large Vision Models (LVMs), Large Multimodal Vision-Language Models, Robot Foundation Models, Robot policy learning, interactive learning, Human-Robot interaction, Human-Robot Collaboration, policy adaptation

Important note: All contributions to this Research Topic must be within the scope of the section and journal to which they are submitted, as defined in their mission statements. Frontiers reserves the right to guide an out-of-scope manuscript to a more suitable section or journal at any stage of peer review.

Manuscripts can be submitted to this Research Topic via the main journal or any other participating journal.

Impact

  • 3,379 Topic views
  • 1,069 Article views