ORIGINAL RESEARCH article
Front. Comput. Sci.
Sec. Human-Media Interaction
Volume 7 - 2025 | doi: 10.3389/fcomp.2025.1575741
This article is part of the Research Topic: Embodied Perspectives on Sound and Music AI
A Multimodal Symphony: Integrating Taste and Sound through Generative AI
Provisionally accepted
- 1 Department of Information Engineering, CSC - Centro di Sonologia Computazionale, University of Padova, Padua, Veneto, Italy
- 2 Centre for Mind and Brain Sciences, University of Trento, Rovereto, Trentino-Alto Adige, Italy
- 3 SoundFood s.r.l., Terni, Italy
In recent decades, neuroscientific and psychological research has traced direct relationships between taste and auditory perception. Building on this foundational research, this article explores multimodal generative models capable of converting taste information into music. We provide a brief review of the state of the art in this field, highlighting key findings and methodologies. We then present an experiment in which a fine-tuned version of a generative music model (MusicGEN) is used to generate music from detailed taste descriptions provided for each musical piece. The results are promising: according to the participants' evaluation (n = 111), the fine-tuned model produces music that reflects the input taste descriptions more coherently than the non-fine-tuned model. This study represents a significant step towards understanding and developing embodied interactions between AI, sound, and taste, opening new possibilities in the field of generative AI.
Keywords: generative AI, crossmodal correspondences, taste, audition, music
Received: 12 Feb 2025; Accepted: 30 May 2025.
Copyright: © 2025 Spanio, Zampini, Rodà and Pierucci. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Matteo Spanio, Department of Information Engineering, CSC - Centro di Sonologia Computazionale, University of Padova, Padua, Veneto, Italy
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.