Frontiers | Bridging Foundation Models and Human-Centered Interaction in Multimodal AI

About this Research Topic

Submission deadlines

Manuscript Submission Deadline 27 April 2026
This Research Topic is currently accepting articles.

Background

Human–computer interaction is shifting into a multimodal era in which language, vision, audio, and spatial context are integrated into everyday devices. Foundation models, including Large Language Models and Vision–Language Models, have made it possible to interpret complex scenes, understand dialogue, provide adaptive guidance, and personalize content across platforms ranging from mobile and desktop to Extended Reality (XR). Despite these advances, practical and societal challenges remain: achieving consistent performance beyond controlled environments, and thoroughly evaluating user experience, privacy, safety, and accessibility in real-world applications, are still open questions. This Research Topic aims to advance the understanding and integration of multimodal AI into human-centered interaction, offering a platform for discussions and innovations that connect multimodal AI to the Human–Media Interaction community.

This Research Topic seeks to combine the capabilities of modern foundation models with the demand for robust, trustworthy, and inclusive interactive systems. Researchers are encouraged to explore how multimodal perception and language-based reasoning can improve real-time understanding of user intent and context, and how that understanding can inform interface design in both screen-based and immersive settings. Particular emphasis is placed on research that explains or improves reliability and efficiency under realistic conditions; proposes principled evaluation methodologies that combine task performance with user-centric metrics such as usability, workload, comfort, fairness, and accessibility; and examines responsible data practices and governance for sensing and adaptive systems.

To gather further insights into human-centered multimodal interaction, we welcome articles addressing, but not limited to, the following themes:

- Multimodal models and prompting strategies for interaction
- Design and evaluation methodologies for adaptive interfaces across web, mobile, and XR platforms
- Innovations in user modeling and personalization
- Development of datasets, benchmarks, and reproducible protocols
- Considerations of efficiency and deployment including latency and edge/on-device processing
- Explorations of transparency, safety, privacy, ethics, and accessibility in multimodal AI
- Application studies in various sectors such as education, health, industry, culture, or tourism

Manuscripts may be submitted in the following categories: Original Research, Methods/Technology Reports, Data Reports, Systematic Reviews/Reviews, Brief Research Reports, and Perspectives.

Article types and fees

This Research Topic accepts the following article types, unless otherwise specified in the Research Topic description:

Brief Research Report
Conceptual Analysis
Curriculum, Instruction, and Pedagogy
Data Report
Editorial
FAIR² Data
FAIR² DATA Direct Submission
General Commentary
Hypothesis and Theory

Articles that are accepted for publication by our external editors following rigorous peer review incur a publishing fee charged to authors, institutions, or funders.

Keywords: Multimodal AI, Large Language Models, Vision–Language Models, Extended Reality, Human–Computer Interaction

Important note: All contributions to this Research Topic must be within the scope of the section and journal to which they are submitted, as defined in their mission statements. Frontiers reserves the right to guide an out-of-scope manuscript to a more suitable section or journal at any stage of peer review.

Frontiers in Computer Science

Human-Media Interaction

Manuscripts can be submitted to this Research Topic via the main journal or any other participating journal.

Topic editors

Andrea Generosi

Maura Mengoni
