Do you like me? Behavioral and physical features for socially and emotionally engaging interactive systems

With the aim of giving an overview of the most recent discoveries in the field of socially engaging interactive systems, the present paper discusses features affecting users' acceptance of virtual agents, robots, and chatbots. In addition, questionnaires exploited in several investigations to assess the acceptance of virtual agents, robots, and chatbots (voice only) are discussed and reported in the Supplementary material to make them available to the scientific community. These questionnaires were developed by the authors as a scientific contribution to the H2020 projects EMPATHIC (http://www.empathic-project.eu/) and Menhir (https://menhir-project.eu/), and the Italian-funded projects SIROBOTICS (https://www.exprivia.it/it-tile-6009-si-robotics/) and ANDROIDS (https://www.psicologia.unicampania.it/android-project), to guide the design and implementation of the promised assistive interactive dialog systems. They aim to quantitatively evaluate Virtual Agents Acceptance (VAAQ), Robot Acceptance (RAQ), and Synthetic Virtual Agent Voice Acceptance (VAVAQ).


1. Introduction
Socially engaging interactive systems can be virtual agents, robots which physically occupy the user's space, and chatbots (also intended as conversational voice interfaces). There are several factors affecting the way these different technological entities are accepted by their users. User acceptance can be defined ". . . as the demonstrable willingness within a user group to employ information technology for the tasks it is designed to support" (Dillon, 1996). User acceptance differs from other concepts such as user experience (UX), system quality, and usability, since acceptance can be considered as their result: it comes after them and contains them. It has been shown over time that user attraction cannot be reduced to the perceived usefulness of a system and its ease of use (Davis, 1989). In fact, theoretical constructs such as a user's social influence and the accomplishment of significant user goals, together with hedonic motivations (the fun or pleasure of using a technology), price values (a trade-off between perceived benefits and monetary costs), and users' habits, must be considered as further determinants affecting users' intentions to use such systems (Venkatesh et al., 2003, 2012). These constructs have been operationalized through well-known theoretical models such as the Technology Acceptance Model (TAM; Davis, 1989), which evolved into the Unified Theory of Acceptance and Use of Technology (UTAUT; Venkatesh et al., 2003), and lately into UTAUT2 (Venkatesh et al., 2012), as well as the Almere model, developed as a further evolution of UTAUT2 upon the criticism that the latter does not account for variables related to social interactions with robots or virtual agents and does not consider seniors as potential users (Heerink et al., 2010; Tsiourti et al., 2014).
However, these theoretical formulations are not able, in our opinion, to explain behavioral intention in different contexts, especially considering that contemporary interactive systems are increasingly complex, showing humanoid and human appearances. To this aim, a systematic investigation was conducted to assess the effects of behavior- and appearance-related features of virtual agents, robots, and synthetic voices on users' acceptance; specific user domain preferences were exploited in the context of healthcare and particularly in the context of the H2020 projects EMPATHIC (http://www.empathic-project.eu/) and Menhir (https://menhir-project.eu/), and the Italian-funded projects SIROBOTICS (https://www.exprivia.it/it-tile-6009-sirobotics/) and ANDROIDS (https://www.psicologia.unicampania.it/android-project), to guide the design and implementation of the promised assistive interactive technologies. These projects brought with them the promise of guiding the implementation of virtual coaches in order to simplify the lives of elderly people living alone and make them more independent, while at the same time monitoring their mental health status. What is fundamental is the possibility of exploiting intelligent and socially believable Information Communication Technology (ICT) interfaces that support seniors in living autonomously, simplify their management of daily tasks, and lighten workloads for caregivers. Moreover, technologies such as robots, chatbots, and virtual agents can help not only elders but anyone requiring support for daily life activities. For instance, conversational technologies in the shape of virtual agents, social robots, and chatbots can be exploited to improve users' mental wellbeing and lifestyles.
These systems can be used as diagnostic tools for monitoring and treating symptoms of mental health conditions (Lovejoy, 2019), assessing users' tendency to engage in risky health behaviors (Elmasri and Maeder, 2016), encouraging users to adopt behaviors to increase wellbeing and reduce stress (Gardiner et al., 2017), and monitoring conversations with users and detecting the presence of depressive symptoms (Delahunty et al., 2018). Conversational agents can be exploited as mental health tools with the aim of providing support to people living with post-traumatic stress disorder (PTSD; Tielman et al., 2017), schizophrenia (Huckvale et al., 2013), phobias (Brinkman et al., 2008), major depression (Pérez Díaz de Cerio et al., 2011), and children with autism (Bernardini et al., 2013). These aspects were more deeply investigated in the context of the MENHIR project (https://menhir-project.eu), aimed at researching and developing conversational technologies to promote mental health and assist people with mental health conditions (e.g., depression and anxiety) to manage their conditions. Since the successful incorporation of assistive technologies in everyday life depends mainly on how the users perceive and accept these assistive technologies (De Graaf et al., 2015), the authors' work focused on investigating these issues by adopting a user-centered perspective. Therefore, the present work summarizes these investigations, providing:
• an overview of the factors affecting users' perception of virtual agents (Section 2).
• an overview of the factors affecting users' perception of robots (Section 3).
• an overview of the factors affecting users' perception of chatbots (Section 4).
• a detailed description of the questionnaires developed to assess acceptance: the Virtual Agents Acceptance Questionnaire (VAAQ), the Robot Acceptance Questionnaire (RAQ), and the Synthetic Virtual Agent Voice Acceptance Questionnaire (VAVAQ; Section 5).

2. Features for accepting virtual agents
Virtual agents are cyber entities capable of communicating using human-like communicative modalities, such as voice, facial expressions, and body movements (Pelachaud, 2009). The appearance of virtual agents has a strong impact on the degree of users' acceptance. Appearance includes the physical and social features of the agent, such as its face, voice, gender, dressing style, and personality (Díaz-Boladeras et al., 2013; Esposito et al., 2021). An agent's voice and face have a strong impact on users' perceptions, as shown by studies highlighting people's skeptical reactions toward agents developed using the combination of a human face with a synthetic voice or a synthetic face with a human voice (Gong and Nass, 2007). Studies have highlighted that senior users prefer human-like agents rather than machine- or animal-like ones (Straßmann and Krämer, 2017) and that people consider humanoid agents with a cartoon-like appearance more pleasant compared to realistic humanoid agents (Ring et al., 2014). Even children, when required to recognize realistic and stylized facial emotional expressions, seem to prefer stylized faces for the identification of surprise (Esposito et al., 2013). Esposito et al. (2019a) observed that voice seems to be a fundamental factor in increasing senior users' acceptance of virtual agents, while young adults and adolescents do not seem to be strongly influenced by agents' voices. Moreover, interfaces endowed with a human face are able to improve employers' productivity (Kong, 2013), and virtual agents with human-like faces induce more positive user reactions compared to agents with animal-like or cartoon-like faces (Forluzzi et al., 2007; Oh et al., 2016). Gender has also been found to impact users' willingness to interact and is a factor capable of strongly influencing users' beliefs and expectations (Niculescu et al., 2010; Esposito et al., 2018b).
One study (Ashby Plant et al., 2009) showed students had higher performances, increased interest, and feelings of self-efficacy while interacting with a female agent. Other studies in which seniors were involved highlighted that they noticeably enjoyed interacting with a synthetic speaking voice produced by a static female agent (Cordasco et al., 2014). A further study (Esposito et al., 2018b) investigating seniors' preferences highlighted that they assessed female humanoid agents as more pleasant, practical, and attractive than male agents and that they were more prone to engage in long-lasting interactions with them. As mentioned above, the virtual agents' dressing style is another variable affecting users' attitudes toward virtual agents, and dressing style significantly interacts with gender.
In this regard, Lunardo et al. (2016) showed that female virtual agents presented while wearing corporate clothing were evaluated as more attractive compared to male agents, which increased social presence and trust and had a positive effect on online consumer behavior. Virtual agents' gender also seems to interact with other variables such as the level of agents' realism. More specifically, users seemed to prefer interacting with female virtual agents characterized by a more realistic appearance (Payne et al., 2013). Agents' behavior is another aspect influencing users; indeed, a study on persuasion showed that virtual agents characterized by higher behavioral realism were more convincing than those lower in behavioral realism (Guadagno et al., 2007). Even an agent's perceived personality has an impact on users, as shown in work by Esposito et al. (2018a) involving seniors. In this context, seniors expressed preferences for interacting with virtual agents showing joyful and practical personalities rather than sad and aggressive traits. A further crucial aspect related to the appearance and the design of a virtual assistant concerns a virtual agent's ability to manifest emotional expressions. Emotions are crucial for humans' survival and social adaptation, and they also represent a fundamental component of human-machine interaction. In fact, people prefer to interact with virtual agents which can show emotional facial expressions rather than with unemotional virtual agents (Gobron et al., 2013). It has been shown (de Melo et al., 2014) that during negotiation processes with virtual agents, people tended to concede more if the agent expressed anger or blame compared to conditions in which the agent expressed happiness. However, when participants had the possibility to choose between accepting or rejecting an offer, they tended to accept an offer from agents expressing emotions such as joy.
Conversely, they tended to reject offers and withdraw from the negotiation when a virtual agent expressed anger or sadness. Other studies investigated the effect that facial emotional expressions conveyed by virtual agents can exert on the user within the interaction process. Bartneck et al. (2007) carried out an investigation in which participants joined a negotiation task that required them to interact with either a screen character or a robotic character. In both conditions, participants rated the interaction with the characters expressing emotions as more pleasant compared to the emotionless characters. In summary, an assistive technology embodied in a virtual agent is more appealing to a population ranging from 14 to 65+ years old when implemented as a female virtual agent, aged between 29 and 35 years, and with a pragmatic and/or joyful personality.

3. Features for accepting robots
As with virtual agents, users' acceptance of socially assistive robots (SARs) is affected by several features. To the same extent as in human interactional exchanges, facial features, gender, age, and ethnicity represent sources for users to understand and accept the assistance of a robot (Smarr et al., 2011). Robots' appearance is one of the most important factors in determining people's preferences. A major distinction occurs between humanoid robots, characterized by a human-like appearance, and android robots, which instead mimic a realistically human appearance. Since the formulation of the uncanny valley theory by Mori (1970), which is used to describe people's reactions to robots and how these reactions vary according to the distinct levels of perceived human likeness, several studies have focused on testing the effects of different levels of human likeness on users' acceptance of robots. Regarding the feeling of eeriness that a robot could cause in users, different explanations have been proposed: according to some studies, it is the stimulus category (human vs. non-human) that causes the uncanny valley effects rather than its level of human likeness (Burleigh et al., 2013); other studies have highlighted that this effect is due to the inadequacy of the rendering of some human-like characteristics of the robot, for instance, the robots' slow movements or poor lexicons (Wang et al., 2015), as well as the lack of consistency and reduced realism in human eyes-eyelashes-mouth, skin-nose-eyebrows (MacDorman and Chattopadhyay, 2015). Other studies investigating potential user attitudes toward robots identified a clear uncanny valley effect since humanoid robots were evaluated as more friendly and pleasant (MacDorman and Ishiguro, 2006; Wu et al., 2012; Mara and Appel, 2015; Ferrari et al., 2016), as well as being more suitable for performing assistive duties, protection and security tasks, and front desk occupations (Esposito et al., 2020a) than androids.
Nevertheless, in contrast with the current trend observed in the literature, some studies (Esposito et al., 2019c, 2020b) have highlighted seniors' preference for androids rather than humanoid robots. It has also been shown that people tend to attribute racial/ethnic identities to robots (Sparrow, 2020); thus, it follows that a robot's ethnicity could have a strong impact on users' acceptance, as shown by Esposito et al. (2020b), where seniors' preferences were for female android robots with Asian traits and male androids with Caucasian traits. Nevertheless, none of these factors (e.g., the levels of human likeness, gender, and ethnicity) seems to affect users' acceptance on its own; rather, user acceptance appears to be a non-linear combination of all these factors (Esposito et al., 2022). User acceptance of robots depends not only on characteristics that the robot should have but also on features that the user would prefer the robot not to have, as in a study (de Graaf et al., 2019) in which participants negatively evaluated the sociability and the companionship possibilities of domestic robots, suggesting that people did not seem to want robots to behave socially.
To summarize, the acceptance of assistive technologies embodied in social robots is harder to achieve because it has not yet been possible to implement robots that adequately render human appearance in movements, facial expressions (eyes-eyelashes-mouth, skin-nose-eyebrows), and language.

4. Features for accepting chatbots
A chatbot consists of an interactive interface based on computer software able to simulate human conversations through natural language (Beilby et al., 2014). To be successfully accepted by users, chatbots should possess certain features, for instance, the ability to easily start an interaction, to precisely understand a user's words, to be trustworthy, and to provide correct and relevant answers, as well as the ability to express emotions (Tatai et al., 2003; Zamora, 2017; Zumstein and Hundertmark, 2017). Rietz et al. (2019) examined the influence of anthropomorphic chatbot design features on user acceptance, highlighting that this design characteristic increases chatbot perceived usefulness. Language style is a fundamental feature to consider while designing chatbots, as shown in the study of Gnewuch et al. (2020) in which chatbots with different language styles, that is, dominant and submissive, were exploited. The study highlighted that when users perceived a similarity between their own and the chatbot's language style, their degree of self-disclosure and chatbot acceptance increased. A further way to increase chatbot acceptance consists of providing the chatbot with a synthetic voice, as in other well-known speech-based technologies such as Alexa and Google Assistant. Some guidelines concerning the characteristics that a synthetic voice should have in order to meet users' expectations are derived from studies investigating the role that the voice plays in the acceptance of virtual agents. These studies highlighted that potential users prefer to interact with synthetic voices, even if they are not equipped with a visual interface or virtual avatar, rather than interact with mute agents (Esposito et al., 2021). A recent study (Amorese et al., 2023) analyzed the effect of synthetic voices' gender and quality on users' preferences, involving mental health experts and participants living with depression and/or anxiety. The results showed that participants' preferences seemed to be affected by both the gender and quality of the synthetic voice. More specifically, participants preferred female voices and high-quality voices. It also emerged that the quality of a synthetic voice in particular seemed to have a stronger impact on users' evaluations compared to the voice's gender. Table 1 summarizes all the factors affecting users' acceptance as discussed above.
5. Questionnaires to assess acceptance of virtual agents, robots, and chatbots (voice only)
With the aim of testing the previously mentioned factors and providing information concerning the perception of virtual agents, robots, and synthetic voices, as well as the degrees of technology acceptance among users, questionnaires were developed to explore potential users' satisfaction while interacting with virtual agents, robots, and synthetic voices, respectively named the Virtual Agents Acceptance Questionnaire (VAAQ), Robot Acceptance Questionnaire (RAQ), and Virtual Agent Voice Acceptance Questionnaire (VAVAQ). With regard to the VAVAQ, we did not use other standard questionnaires for measuring voice quality because we wanted to collect data on synthetic voices that could be compared with the data on robots and virtual agents collected with the same tool. The questionnaires were developed taking inspiration from Hassenzahl's AttrakDiff questionnaire (2003, 2004, and 2014), designed to test the usability and appearance of interactive products (i.e., enterprise software, consumer products, websites, or medical devices) and distinguishing between pragmatic and hedonic factors. The VAAQ, RAQ, and VAVAQ questionnaires (reported in the Supplementary material section) are composed of seven sections. Within the Supplementary material, only one questionnaire is reported: this single questionnaire can be used to measure any of the three systems, since the questions are the same and only the type of system being evaluated changes. Moreover, the Supplementary material contains the abridged version of the originally developed questionnaire; this version has been modified over time, and non-descriptive items have been eliminated so as to make administration of the questionnaire less burdensome for the participants.
Since the questionnaire shortening could be considered as an improvement of the questionnaire, only the abridged version is reported. The first section is composed of four items collecting socio-demographic information about participants and three items investigating participants' experiences with technology and difficulties while using devices such as smartphones, tablets, and laptops. The second section, composed of one item, evaluates participants' willingness to interact with the proposed systems. The third section investigates how participants perceive the system and consists of four subsections, each composed of six items:
Subsection 1 is devoted to assessing the pragmatic qualities (PQ) of the system, regarding the system's usefulness, effectiveness, practicality, and ease of use.
Subsection 2 is devoted to assessing the hedonic qualities-identity (HQI) of the system, regarding the system's originality, professionality, creativeness, and pleasantness.
Subsection 3 is devoted to assessing the hedonic qualities-feeling (HQF) of the system, regarding the system's ability to arouse both positive and negative feelings.
Subsection 4 is devoted to assessing the attractiveness (ATT) of the system, regarding the system's attractiveness and ability to encourage increased use and long-term relationships.
The fourth section is composed of three items assessing the impact that the perceived age attributed to the agent, robot, or voice could have on the user. Section five investigates systems' perceived suitability for performing tasks in: (a) welfare occupations for seniors, children, and disabled people; (b) housework; (c) protection and security occupations; and (d) public relations and front office occupations. Section six is specifically devoted to assessing systems' voice and in particular its intelligibility, expressiveness, and naturalness. Section seven, lastly, is devoted to evaluating the possible effect of exploiting Wizard of Oz (WoZ) techniques during the interactions and thus has to be administered only when WoZ procedures are involved. For each item, participants' answers were given on a 5-point Likert scale: 1 = strongly agree, 2 = agree, 3 = I don't know, 4 = disagree, and 5 = strongly disagree. Since sections two, three, six, and seven of the questionnaires are composed of both positive and negative items evaluated on this scale, scores from negative items are reverse-scored, so that low scores correspond to positive evaluations and high scores to negative ones.
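The reverse-scoring rule described above can be illustrated with a minimal Python sketch. The item identifiers (PQ1 to PQ6), the choice of which items are negatively worded, and the use of the mean as the subscale score are assumptions made purely for illustration; they are not taken from the questionnaires in the Supplementary material.

```python
from statistics import mean

# 5-point scale as described in the text: 1 = strongly agree ... 5 = strongly disagree
SCALE_MIN, SCALE_MAX = 1, 5


def reverse_score(raw):
    """Reverse a Likert answer so that 1 <-> 5, 2 <-> 4, and 3 stays 3."""
    if not SCALE_MIN <= raw <= SCALE_MAX:
        raise ValueError(f"answer {raw} is outside the {SCALE_MIN}-{SCALE_MAX} scale")
    return SCALE_MAX + SCALE_MIN - raw


def subscale_score(responses, negative_items):
    """Average one participant's answers after reversing the negative items.

    Aggregating by the mean is an assumption of this sketch; a sum would
    preserve the same ordering of participants.
    """
    corrected = [
        reverse_score(answer) if item in negative_items else answer
        for item, answer in responses.items()
    ]
    return mean(corrected)


# Hypothetical answers of one participant to a six-item subsection.
pq_responses = {"PQ1": 1, "PQ2": 2, "PQ3": 5, "PQ4": 4, "PQ5": 1, "PQ6": 2}
pq_negative = {"PQ3", "PQ4"}  # hypothetical negatively worded items

pq_score = subscale_score(pq_responses, pq_negative)
print(pq_score)
```

After correction, a low subscale score indicates a positive evaluation of the system, in line with the scale orientation described in the text.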
The RAQ questionnaire was recently validated using principal components analysis (PCA) and the internal consistency was checked; the work is currently under submission. We are also planning to extend the validation work to the other questionnaires (VAAQ and VAVAQ), and publish the results.
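The text does not state which statistic was used to check internal consistency. Cronbach's alpha is one common choice, and the following minimal Python sketch shows how it could be computed for one subsection, assuming reverse-scoring has already been applied. The function name and the example data are hypothetical and are not results from the validation study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of participants' item scores.

    rows: one list per participant, each holding that participant's
    (already reverse-scored) answers to the k items of a subscale.
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two participants")
    # Sample variance of each item across participants.
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each participant's total score.
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: four participants answering three items.
example_rows = [
    [1, 2, 1],
    [2, 2, 2],
    [4, 5, 4],
    [5, 4, 5],
]
alpha = cronbach_alpha(example_rows)
print(round(alpha, 3))
```

Values closer to 1 indicate that the items of a subsection vary together across participants, which is the usual reading of high internal consistency.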

6. Conclusions
In this article, we presented a brief overview of the features that socially and emotionally engaging interactive systems should possess in order to meet users' needs and expectations. We focused in particular on three typologies of interactive systems: virtual agents, robots, and chatbots. It emerged that there are several physical and behavioral features capable of affecting users' acceptance and that these even interact with each other. Considering this, and the authors' involvement, as previously mentioned, within the H2020 projects "EMPATHIC" and "MENHIR," a systematic investigation was conducted assessing the behavioral and appearance-related features affecting users' acceptance of virtual agents, robots, and synthetic voices in the context of healthcare. The hope was to provide guidelines, as emphasized in the aims of the EMPATHIC project, to "develop causal models of [agent] coach-user interactional exchanges, which engage elderly [sic] in emotionally believable interactions keeping off loneliness, sustaining health status, enhancing quality of life and simplifying access to future telecare services." Among the initial research steps, priority was given to the development of a special questionnaire to assess seniors' preferences toward the developed empathic virtual coach. Throughout the midterm period, the researchers from Università della Campania L. Vanvitelli developed the "Virtual Agent Acceptance Questionnaire" (VAAQ), which dynamically changed during the project to better fit the observed final users' requirements; it also gave rise to the corresponding versions of the questionnaire dedicated to robots [the Robot Acceptance Questionnaire (RAQ)] and synthetic voices [the Virtual Agent Voice Acceptance Questionnaire (VAVAQ)].
This paper, in fact, also reports the final (shortened) versions of the questionnaires and results, along with the testing of a large population of users including adolescents, young adults, middle-aged adults, and seniors, assessing their acceptance of not only virtual agents but also interactive systems such as conversational voice interfaces and humanoid and android robots.