AUTHOR=Huang Kuo-Liang, Duan Sheng-Feng, Lyu Xi

TITLE=Affective Voice Interaction and Artificial Intelligence: A Research Study on the Acoustic Features of Gender and the Emotional States of the PAD Model

JOURNAL=Frontiers in Psychology

VOLUME=Volume 12 - 2021

YEAR=2021

URL=https://www.frontiersin.org/journals/psychology/articles/10.3389/fpsyg.2021.664925

DOI=10.3389/fpsyg.2021.664925

ISSN=1664-1078

ABSTRACT=New types of artificial intelligence products are gradually shifting to voice interaction, and the demands placed on intelligent products are expanding from communication to recognizing users’ emotions and giving instantaneous feedback. At present, affective acoustic models are built through deep learning and abstracted into a mathematical model, so that computers learn from data and acquire predictive ability. Although this method can yield accurate predictions, it lacks explanatory capability; there is an urgent need for empirical study of the connection between acoustic features and psychology to serve as the theoretical basis for adjusting model parameters. Accordingly, this study explores how seven major “acoustic features” and their physical characteristics during voice interaction differ with “gender” and with the “emotional states of the PAD model” in recognition and expression. A total of 31 females and 31 males aged between 21 and 60 years were recruited by stratified random sampling to make audio recordings of different emotions. Parameter values of the acoustic features were then extracted using Praat voice software and analyzed in SPSS using a two-way mixed-design ANOVA. Results revealed that “gender” and “emotional states of the PAD model” differ significantly across the seven major “acoustic features”: (1) six acoustic features, Fo (Hz), Fo SD, Intensity (dB), Jitter%, Shimmer% and HNR, show a significant interaction between gender and emotional state, whereas “velocity” does not; (2) for velocity across emotions, males speak significantly faster (M = .29 s per word) than females (M = .33 s per word); and (3) among the six interacting acoustic features, the “simple main effect of gender” shows significant differences between females and males in the degree and ranking of “emotional state”, except for males in Fo SD. With respect to different emotions, the “simple main effect of emotional state” shows that all six acoustic features differ significantly in degree and rank.