Human–computer interaction is shifting into a multimodal era in which language, vision, audio, and spatial context are integrated into everyday devices. Foundation models, including Large Language Models and Vision–Language Models, have paved the way for interpreting complex scenes, understanding dialogue, providing adaptive guidance, and personalizing content across platforms ranging from mobile and desktop to Extended Reality (XR). Despite these advances, practical and societal challenges remain: achieving consistent performance beyond controlled environments and thoroughly evaluating user experience, privacy, safety, and accessibility in real-world applications are still open questions. This Research Topic aims to advance the understanding and integration of multimodal AI into human-centered interaction, offering a platform for discussions and innovations that connect multimodal AI to the Human–Media Interaction community.
The goal is to combine the capabilities of modern foundation models with the demand for robust, trustworthy, and inclusive interactive systems. Researchers are encouraged to explore how multimodal perception and language-based reasoning can improve real-time understanding of user intent and interaction context, and thereby inform interface design in both screen-based and immersive settings. Particular emphasis is placed on research that explains or improves reliability and efficiency under realistic conditions; proposes principled evaluation methodologies that combine task performance with user-centric metrics such as usability, workload, comfort, fairness, and accessibility; and addresses responsible data practices and governance tailored to sensing and adaptive systems.
To gather further insights in the field of human-centered multimodal interaction, we welcome articles addressing, but not limited to, the following themes:
- Multimodal models and prompting strategies for interaction
- Design and evaluation methodologies for adaptive interfaces across web, mobile, and XR platforms
- Innovations in user modeling and personalization
- Development of datasets, benchmarks, and reproducible protocols
- Considerations of efficiency and deployment including latency and edge/on-device processing
- Explorations of transparency, safety, privacy, ethics, and accessibility in multimodal AI
- Application studies in various sectors such as education, health, industry, culture, or tourism
Manuscripts may be submitted in the following categories: Original Research, Methods/Technology Reports, Data Reports, Systematic Reviews/Reviews, Brief Research Reports, and Perspectives.
Article types and fees
Articles that are accepted for publication by our external editors following rigorous peer review incur a publishing fee charged to Authors, institutions, or funders.
Article types
This Research Topic accepts the following article types, unless otherwise specified in the Research Topic description:
Brief Research Report
Conceptual Analysis
Curriculum, Instruction, and Pedagogy
Data Report
Editorial
FAIR² Data
FAIR² DATA Direct Submission
General Commentary
Hypothesis and Theory
Methods
Mini Review
Opinion
Original Research
Perspective
Policy and Practice Reviews
Registered Report
Review
Systematic Review
Technology and Code
Keywords: Multimodal AI, Large Language Models, Vision–Language Models, Extended Reality, Human–Computer Interaction
Important note: All contributions to this Research Topic must be within the scope of the section and journal to which they are submitted, as defined in their mission statements. Frontiers reserves the right to guide an out-of-scope manuscript to a more suitable section or journal at any stage of peer review.