Human–robot creative interactions: Exploring creativity in artificial agents using a storytelling game

Creativity in social robots requires further attention in the interdisciplinary field of human–robot interaction (HRI). This study investigates the hypothesized connection between the perceived creative agency and the animacy of social robots. The goal of this work is to assess the relevance of robot movements in the attribution of creativity to robots. The results of this work inform the design of future human–robot creative interactions (HRCI). The study uses a storytelling game based on visual imagery inspired by the game “Story Cubes” to explore the perceived creative agency of social robots. This game is used to tell a classic story for children with an alternative ending. A 2 × 2 experiment was designed to compare two conditions: the robot telling the original version of the story and the robot plot twisting the end of the story. A Robotis Mini humanoid robot was used for the experiment, and we adapted the Short Scale of Creative Self (SSCS) to measure perceived creative agency in robots. We also used the Godspeed scale to explore different attributes of social robots in this setting. We did not obtain significant main effects of the robot movements or the story in the participants’ scores. However, we identified significant main effects of the robot movements in features of animacy, likeability, and perceived safety. This initial work encourages further studies experimenting with different robot embodiment and movements to evaluate the perceived creative agency in robots and inform the design of future robots that participate in creative interactions.


Introduction
An important dimension in social interaction is how agents perceive the intelligence and creativity of other agents. However, creativity is an under-explored area in the study of human-robot interaction (HRI) (Saunders et al., 2013). Creativity can be defined as the capacity to imagine alternative futures. Despite its relevance in shaping everyday interactions, we have limited knowledge of how humans perceive creativity and attribute it to themselves and others. Furthermore, the display of creativity is a subjective phenomenon that is challenging to study using experimental methods of inquiry (Svanaes, 2013). This study draws on work in the area of self-assessment of creativity to evaluate the possible connections between the perceived creative agency and the animacy and kinesthetics of social robots.
In artificial agents such as social robots or screen-based avatars, the attribution of creativity has remained largely unaddressed in the field of HRI until recently (Alves-Oliveira et al., 2017, 2020). While artificial intelligence has been investigated extensively, artificial creativity, defined as the creativity attributed to artificial agents, remains to be addressed, particularly in experimental studies. Some creative behaviors have been simulated using language, pattern recognition, and evolutionary generative systems (Pham et al., 2017; Gizzi et al., 2019; Myoo, 2019; Uneeq Ltd, Digital Humans, 2021; OpenAI, 2021). Similarly, some work has been performed on users' non-verbal behavior, self-presence, social presence, and interpersonal attraction in one-to-one human-agent interaction in collaborative virtual environments using realistic humanoid avatars (Herrera et al., 2020). However, in artificial agents such as social robots that rely on physical embodiment to interact with users and the real world, artificial creative behavior will need to be communicated to users verbally and kinesthetically, namely, using movement as part of their communicative means. To our knowledge, the display of creative behavior via physical movement (animacy) in robots is an area requiring further investigation by human-robot interaction researchers.
In human-human interaction (HHI), non-verbal behavior in creative and collaborative tasks has been studied extensively. For instance, Won et al. (2014) suggest that there is a significant correlation between the synchronized movements of a pair of humans and creative outcomes. Hence, the display of creative behavior via movement is likely to be highly relevant in determining the ultimate value and usefulness of social robots interacting with humans in everyday settings.
The impact of robots' physical presence and their movements has been studied previously across contexts (Vignolo et al., 2017). However, more research is needed to better understand how social robots can effectively use movement in everyday interactions, especially in light of screen-based smart applications and disembodied products that use voice interfaces. What may physical robots offer in terms of functionality and usability that screen and voice agents cannot? What are the design affordances enabled by their physicality and movement possibilities that are not available on screen or voice interaction? The work presented in this article seeks to contribute to the future design of social robots by analyzing how humans perceive the robots' movement when these perform a task that requires creative behavior, such as play.
We propose an adapted scale to measure the perceived creativity in social robots in the context of games and playful activities such as creative storytelling. This scale is inspired by the work of Karwowski et al. (2018) and Karwowski (2014) and is modified here to capture how participants rate the creative skills of robots. We are particularly focused on the domain of creative collaborative interactions that could eventually be implemented using social robots.
Working with social agents in the form of robots is relevant because excessive screen time is associated with ergonomic, visual, and behavioral issues. Furthermore, excessive use of screens for entertainment affects people negatively, to the extent that it has recently been classified as "gaming addiction" (WHO, 2018), and it influences people's overall mood, among other effects. At present, users rely intensively on screen-based devices for both work and entertainment services, resulting in extended periods of eye strain and a sedentary lifestyle (Aboujaoude and Starcevic, 2015; Alter, 2018; Desmurget, 2020).
This work is an early exploration of HRI that does not rely on a screen to interact with users. We aim to study alternatives to screen and audio interactions using physical robots for interactive creative activities, relying on more natural interactions with artificial agents. We consider that creativity is a central part of HRI due to its importance in the cognitive and social processes involved in playful interactions. Finally, we expect to contribute in the near future to the design of robots that encourage natural, long-term interactions with cognitive and social gains.

Background
The design of social robots has shown initial evidence of their potential value, in terms of usability and functionality, for assisting users in everyday life. So far, the main applications with notable market success have been robotic carpet and floor cleaners and toys, including robotic pets for therapeutic purposes. For the last two decades, researchers and companies have searched for the "killer app" that builds on the affordances of physical robots to transform the lives of users around the world. In that time, screen-based devices and home assistants that use audio interfaces have made substantial gains in market penetration. Currently, there is a need to understand if and how physical robots will be of value for users in their everyday tasks, as suggested in fiction (Miller, 2021). However, to our knowledge, there are not enough studies that use social robots to examine creative interactions between humans and robots, creative performance by the robot, or the enhancement of human creativity. In contrast, there are numerous apps, software, and websites for this purpose on screen-based devices.

Robot's embodiment and movement
Arguably, the most salient affordance that gives social robots an edge over screen and voice assistants is their physical presence. The importance of physicality and movement in communication is evident from "body language" to 4E cognition, that is, the principle that all human cognition is embodied, embedded, enactive, and extended (Lindblom and Alenljung, 2015). In other words, people are not "brains in jars" but rely heavily on their bodies, the physical world around them, other humans, and their contexts to be able to think, communicate, and be fully human.
There have been studies on issues relevant to the design of robots with movement in mind, such as the study by Hoffman and Ju (2014). In the early stages of research in human-robot interaction, researchers such as Van Breemen (2004) understood that body gestures are a natural channel to communicate a robot's agency and social behaviors. Furthermore, the motion of robot agents (mechanical and organic), as one of the main features differentiating robots from AI or computers, has been explored by Harris and Sharlin (2011) to highlight its relevance in esthetic and functional contexts. Similarly, Bainbridge et al. (2011) explored how the physical presence of a robot affects human judgments of the robot as a social partner. Looking for the effects of form and motion in robotic agents, Castro-González et al. (2016) investigated how the combination of a robot's bodily appearance and movement can alter users' attributions of animacy, likability, trustworthiness, and unpleasantness. In their study, a Baxter robot executing mechanistic movement was perceived as inanimate, whereas the same robot performing naturalistic movements was perceived as unpleasant. However, none of these previous works was placed in the context of the creative expressiveness of robot agents.
Our work aims to explore and understand how animacy plays a role in the perception of social robots in the future design of interactive tasks related to creativity. According to the Oxford English Dictionary, animacy is ". . .the quality or condition of being alive or animate" (Animacy, 2021) while kinesthetics refers to ". . .the effort that accompanies a voluntary motion of the body" (Kinaesthesis, 2021). In the context of this research, kinesthetics and animacy refer to the study and perception of body motion, and the kinetic design refers to the use of movement as a design material (Sosa et al., 2015).

Games as a setup in HRI
We are interested in examining the physicality and movement affordances of social robots (their animacy qualities), and their potential advantages over traditional board games or interactive games such as mobile apps and voice assistants. It may be possible to combine the best of digital and physical affordances to design social robots that can meaningfully augment social games. To this end, studies are needed to assess the impact of the physical presence and movement of robots in these contexts of use.
Social and creative games represent interesting settings to study the interaction with social robots as they create a space for playful semi-structured interactions. In many social games, clear rules exist but significant open-endedness is supported to exercise and enjoy the creativity of oneself and others.
Creative games with open rules such as Story Cubes have not often been used in HRI, although games in general are a popular setup in the field. For instance, Leite et al. (2009) used chess (to some extent a creative game) as a setup to understand how the social presence of robots is perceived. Similarly, interactive storytelling in HRI has been reported as a promising scenario for children's social skills development (Leite et al., 2015, 2017). Storytelling games have the potential to support creative social interactions. We thus select social games, and particularly storytelling games such as the popular "Story Cubes" (Ros and Demiris, 2013; Eladhari et al., 2014; Bae et al., 2016; Gordon and Spierling, 2018), as the site of research for this study.

Displaying robot creativity
Creativity is an important component of social interaction (Rogers, 1954). Movement has communicative properties that make it an essential part of social interaction between humans (Goldman, 2004), and it is widely regarded as central to embodied experiences including everyday creativity (Svanaes, 2013). Social robots that help develop creative capacities have been proposed before, demonstrating the importance of physical movement in this type of application (Hoffman and Ju, 2014). Similarly, storytelling is a creative and social activity that has entertainment value but is also used to support learning (Sadik, 2008) and health (Plaisant et al., 2000).
Kinesthetic creativity has been studied mostly in artistic performance by humans (Ros and Demiris, 2013; Tan et al., 2018). To our knowledge, this work represents an early approach to the study of kinesthetic creativity in social robots. The research methodology for this study is an experimental design based on previous studies such as those by Salem et al. (2011), Hoffman and Ju (2014), Hoffman et al. (2015), and Tung (2016). The effects of robot movement are thus assessed through participants' perception of movement as a cue indicating creative agency in social robots.

Motivation for this exploratory study
In this study, we chose Story Cubes as an experimental setup. This is a game where players create or re-create stories to share with others, thus requiring to some extent creative skills found universally (Brent, 2014). Our study asks whether social robots will have an edge based on their physicality and their basic animacy features to engage in creative activities such as storytelling games. More specifically, it seeks to assess to what extent movement may make a difference in such settings. If the answer to this question is positive, then further work will be needed to identify and evaluate the kinetic principles for the design of playful and creative social robots more specifically. If the answer is negative, then it is more likely that screens and voice interaction devices will be more adequate and arguably even easier to develop, deploy, maintain, and operate in this type of activities rather than robots that use physical bodies to move and reinforce non-verbal interactions. To this end, our study addresses the research gap in how robot movement is perceived when they perform a creative storytelling activity.
In sum, with studies like this, we aim to contribute to the emergent exploration of human-robot creative interaction (HRCI). Thus, here we set out to identify the hypothesized connection between the perceived creative agency and the animacy of social robots. Our goal is to evaluate the relevance of robot movements to the attribution of creativity to social robots. At this stage, we aim to provide a benchmark for future experiments using non-choreographed robot movements. Similarly, we use storytelling games supported by visual cues to study how movement shapes humans' perception of creative agency in robots. We need to highlight that most social robots present limitations in terms of dexterity when manipulating objects on a human scale.

Research goals and questions
This study aims to assess the connection between perceived creative agency and the animacy of social robots. We evaluate the relevance of robot movements to how observers attribute kinesthetic creativity to social robots and the verbal delivery of a story as a creative act. The aim of this experiment is to explore the extent to which robots' movement supports the display of a creative act.
With this exploratory study, we aim to address the following research questions without proposing any hypotheses, due to the complex nature of the interaction and the exploratory nature of the study:
(1) To what extent do humans perceive creative agency in robots when these tell a story as part of a game (display of a creative act)?
(2) To what extent do humans perceive creative agency in robots when these display movements accompanying the delivery of the story?
(3) How are robots perceived as creative agents compared with humans?

The robot
To evaluate the research questions, we programmed a Robotis Mini humanoid robot (see Figure 1A) to play our own version of Story Cubes (storytelling game) in a Wizard of Oz setup. The robot used a female English-language voice (Karen) generated in Mac OS 10.15.7 and played at slow speed (25%) using a Bluetooth speaker next to the robot. We chose this robot due to its small dimensions, which make it easy to transport and suitable for future experiments using board games. The robot was animated using the Robotis Mini app in iOS 14.4. According to the work of Bernotat et al. (2021) and Kuchenbrandt et al. (2014), the robot design can lead to a male or female perception of the social robot. Hence, we decided to use a female voice to neutralize the possible male perception of the robot, considering the very sharp and angular design of the mini humanoid. The robot movements were presented but not choreographed, as they were just a stimulus to indicate robot animacy rather than a means of supporting the delivery of the story.

The game
The Story Cubes game is a collaborative board game using six or more dice with adaptable rules and an indeterminate number of players. The goal of the game is to create a story with the contribution of all participants. We designed our Story Cubes for this experiment as a set of six oversized dice that would be visible in the video recordings. After five design iterations using different materials, we used 30 mm, solid white, PDA, 3D-printed dice, which were laser engraved and hand painted. We modified publicly available icons under a Creative Commons license for our experiment. Thirty-six icons were engraved, and six icons were used to reveal "The Ugly Duckling" story for the two different versions used in the experiment.

The story
We use the well-known literary tale titled "The Ugly Duckling" by the Danish author Hans Christian Andersen (Andersen et al., 1995) and a modified version of that story with an alternative ending. Stories are an effective way to connect with other humans and social agents. The creative act of telling a story, combined with movement, could possibly lead users to perceive the creative agency of robots in significantly different ways. Similarly, a universal story such as The Ugly Duckling helps us to reach a large and diverse pool of participants for this experiment. Finally, the plot twist leading to a different end of the story is a creative act performed by a robot, which may or may not be perceived by the user.
Frontiers in Robotics and AI frontiersin.org

Setup
The standard version of The Ugly Duckling was split into six short sections to match the six icons shown in Figure 1B, plus two additional sections for the Introduction and Wrap-up. The creative story was discussed in depth by the authors and validated by four experts in creative writing, English literature, and film scripts in its early and latest versions. The latest version uses dinosaur references, which were considered unexpected and perceived as creative without the controversy and the risk of unintentionally offending or hurting the feelings of a particular community, as the ethics committee suggested (Application HC200985). This last version was validated by one of the specialists in English literature. Both versions, namely, the standard and the creative stories, can be read as follows: Original Story:
The robot, the dice, and the speaker (not visible) were placed in a photo tent in order to isolate them from external stimuli, avoid distractions for the users, and make it possible to replicate the experiment. A high-contrast red carpet was used to highlight the dice and the robot. After several tests, we decided to use the video recorded from the top-left corner of the photo tent using an iPhone 12 mini with 0.8x digital zoom. This setup allows viewers to have a full view of the robot and dice. We consider that this point of view allows the user to understand the creation of the story by the robot. See Figure 2. The videos were shorter when the standard story was displayed, due to the three extra sentences used in the creative story. These three extra sentences aim to reinforce the novel character of the creative story and ensure that the participant notices the plot twist. This setup was implemented following the suggestions of the experts in writing and literature.
Four videos were displayed to the participants showing two different robots performing a creative task. The creative task consists of storytelling performed by the robot. The method to tell the story is supported by visual cues in the form of icons on dice and the robot manipulating them. Once the bowl covering the dice was removed, the robot started one of the proposed stories with the movements depending on the conditions. In sum, we programmed a humanoid robot to tell the original version of "The Ugly Duckling" and a modified version of the same story (creative story) using cubes with visual imagery such as those used in the "Story Cubes" creative game. Under the control condition, the robot remains static while telling the story, while under the experimental condition the robot performs movements to accompany the telling of the story kinesthetically. These interactions were video recorded, and participants were requested to fill in a survey evaluating the creative agency of the robot.

Experimental design
We designed a 2 × 2 between-subject online study. The factors are the story and the robot movements. The robot can tell the standard story or the creative story (dinosaur plot twist), and the robot can display movements or not (still) during the storytelling. Hence, we test four conditions displayed by the robot: still robot and standard story (SS), still robot and creative story (SC), moving robot and standard story (MS), and moving robot and creative story (MC). As this is a between-subject study, each participant was exposed to only one of the conditions mentioned earlier; for example, a participant under the condition MC would see a robot gesticulating and telling The Ugly Duckling story with the plot twist of the dinosaur. The dice and videos of this experiment can be requested by contacting the corresponding author. The videos are unlisted on YouTube but can be reviewed using the following links: Video MS condition, Video MC condition, Video SC condition, and Video SS condition. See Table 1.
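As an illustration of the 2 × 2 design described above, the following minimal Python sketch enumerates the four cells and draws one cell per participant. The function names, the seed, and the simple independent random draw are assumptions made for illustration only; the survey flow actually used for randomization is not reproduced here and may balance cells differently.

```python
import itertools
import random

# Factor levels as described in the text; the one-letter codes follow the
# condition labels used in the paper (SS, SC, MS, MC).
MOVEMENT = {"S": "still robot", "M": "moving robot"}
STORY = {"S": "standard story", "C": "creative story"}


def build_conditions():
    """Cross the two factors into the four between-subject cells."""
    return {
        m + s: (MOVEMENT[m], STORY[s])
        for m, s in itertools.product(MOVEMENT, STORY)
    }


def assign(participant_ids, seed=0):
    """Hypothetical simple randomization: every participant independently
    draws exactly one of the four cells (between-subject exposure)."""
    rng = random.Random(seed)
    codes = sorted(build_conditions())
    return {pid: rng.choice(codes) for pid in participant_ids}


if __name__ == "__main__":
    for code, (movement, story) in build_conditions().items():
        print(code, "=", movement, "+", story)
```

With independent draws of this kind, cell sizes are not guaranteed to be equal, which is consistent with slightly uneven group sizes in a study of this type.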

Online survey and measurements
This study was approved by the human ethics committee of the University of New South Wales, application HC200985, reviewed by the HREAP Executive. Similarly, this study was funded using the Scientia Fellowship (PS-46183) development package provided by the UNSW. We implemented the survey on Qualtrics, licensed for the UNSW. A survey flow was created to assign the participants randomly to the four conditions (see Figure 3). The survey was distributed using prolific.co. The survey was designed as follows: first, the participant information was presented, followed by the consent form. Once participants agreed to take part in the study, they were directed to the next page, where demographic information and confirmation of the Prolific ID were collected. The identity of the participants is anonymous, and the demographic information collected included age, gender, occupation, location, and level of education.
Once the demographic information was collected, the Short Scale of Creative Self (Karwowski et al., 2018) questionnaire was applied. The questionnaire consists of eleven questions measuring creative self-efficacy (CSE) and creative personal identity (CPI). The questions are as follows:
(1) I think I am a creative person (CPI).
(2) My creativity is important to who I am (CPI).
(6) Many times I have proven that I can cope with difficult situations (CSE).
(7) Being a creative person is important to me (CPI).
(8) I am sure I can deal with problems requiring creative thinking (CSE).
(9) I am good at proposing original solutions to problems (CSE).
(10) Creativity is an important part of me (CPI).
(11) Ingenuity is a characteristic which is important to me (CPI).

FIGURE 3
Experimental procedure. We applied the questionnaire using Qualtrics. We distributed it using Prolific.co.
Questions one, two, seven, ten, and eleven gauge CPI, and questions three, four, five, six, eight, and nine are assigned to CSE. Following the questions, the participants watched one of the four videos. We confirmed that the video was watched via a check button and by requesting a brief description of the video from the participant. Next, the first two authors proposed a modified version of the SSCS to be applied to robots performing a creative task. In this case, the task is framed as playing our version of Story Cubes. As in the original SSCS, our Likert scale goes from one to five. The anchors are definitely not (1) and definitely yes (5). We will validate this adapted scale in a following article. The proposed questions are listed as follows.
(1) I think that the way the robot played Story Cubes shows it is a creative robot.
In addition, we added a 12th question. The aim of this question is to summarize the general impression of the participant in a single score. The Godspeed questionnaire was applied after the SSCS, and participants were asked for their general impressions of the study. The survey concludes by confirming the end and providing a code that allows the participant to be compensated through Prolific. The content of the survey is available on request from the corresponding author.
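The item-to-subscale assignment described above for the SSCS (items 1, 2, 7, 10, and 11 for CPI; items 3, 4, 5, 6, 8, and 9 for CSE) can be sketched as follows. Averaging the item ratings to obtain each score is an assumption, chosen because the scores reported in this article lie within the 1-5 range; the function name and input format are hypothetical.

```python
from statistics import mean

# Item-to-subscale mapping as stated in the text (1-based item numbers).
CPI_ITEMS = (1, 2, 7, 10, 11)
CSE_ITEMS = (3, 4, 5, 6, 8, 9)


def score_sscs(responses):
    """Score one respondent.

    responses: eleven Likert ratings (1-5), item 1 first.
    Returns mean-based CPI, CSE, and overall SSCS scores
    (averaging is an assumption, not a documented scoring rule).
    """
    if len(responses) != 11:
        raise ValueError("expected 11 item ratings")
    if any(not 1 <= r <= 5 for r in responses):
        raise ValueError("ratings must lie between 1 and 5")
    cpi = mean(responses[i - 1] for i in CPI_ITEMS)
    cse = mean(responses[i - 1] for i in CSE_ITEMS)
    return {"CPI": cpi, "CSE": cse, "SSCS": mean(responses)}


if __name__ == "__main__":
    print(score_sscs([4, 5, 3, 4, 3, 4, 5, 4, 3, 5, 4]))
```

The same function could be applied to the robot-directed ratings, provided the adapted items keep the same ordering and subscale assignment, which is also an assumption here.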

Participants
A total of 297 participants were recruited using the platform Prolific. All participants were over 18 years, and there were no restrictions on gender, formal education, income, or other demographics. Due to technical issues, only 242 participants were exposed to one of the four conditions and compensated for completing the questionnaire. We used the data of 239 participants because the SSCS scores of three participants were not recorded. Participants were paid the equivalent of 1.27 British pounds (or 2.2 Australian dollars) for their participation. The first half of the study was run in February 2021, and the second half was run 1 week later. One hundred forty-five participants were male, 89 female, three non-binary, and two did not specify. The average age was 26.9 years (SD = 8.58). Participants came from a range of locations: 38.5% from North America (Canada, United States, and Mexico), 34.7% from Europe, 13% from the United Kingdom, 7.5% from South America, 4.2% from Oceania, and 2.1% from Africa. The education levels are distributed as follows: 42.7% university degree, 38.5% high school, 9.6% masters, 1.7% PhD, 5.4% vocational education, 1.7% primary education, and 0.4% other. In terms of occupation, 42.3% were students, 38.9% employed (6.3% IT and software, 3.3% artist, 2.5% freelance, 1.3% researcher, and 25.5% other), 12.1% were unemployed, 2.5% homemakers, 0.4% retired, and 1.3% did not specify.
The average time to complete the survey was 9 min 20 s. We suggested that participants use a device with a large screen to have a better user experience and to show the video and survey consistently; 72.8% used Windows 10, 8.4% used Macintosh, and 18.8% used other platforms (Android 10 5.4%, iPhone 4.2%, Windows 6.1 3.8%, Windows 6.3 1.7%, Android 11 1.3%, Android 9 1.3%, Linux x86 0.8%, and Ubuntu 0.4%).
We aimed to allocate at least 60 participants per condition. The data were not fully recorded, as the SSCS scores of three participants were lost. Participants were randomly allocated as follows: SS = 62 participants (minus two missing SSCS scores), SC = 61 participants, MS = 61 participants (minus one missing SSCS score), and MC = 58 participants. The average human SSCS score was 3.89 (SD = 0.66); no significant differences were found among the different experimental conditions (ANOVA). Participants' results were stored in a standard online spreadsheet, and the statistical analysis was performed using IBM SPSS.

Results
In order to address the exploratory questions of this study, we performed multiple 2 × 2 factorial analyses of variance (ANOVA); the factors were the story (standard vs. creative) and movements (still vs. moving). We defined seven dependent variables: the SSCS, question twelve, and the five Godspeed items (anthropomorphism, animacy, likeability, perceived intelligence, and perceived safety). In addition, we computed Pearson's correlations among the human and robot CPI, CSE, and SSCS scores to check the internal consistency and possible significant human-robot correlations among the scores.
5.1 Short Scale of Creative Self (SSCS) score applied to the robots
This variable is our main indicator to assess how participants perceive robots as creative agents. The average SSCS scores of the robots under the four different conditions are as follows: SS = 3.48 (SD = 0.74), SC = 3.66 (SD = 0.61), MS = 3.58 (SD = 0.68), and MC = 3.53 (SD = 0.80). As in the original SSCS, our Likert scale goes from one to five. The anchors are definitely not (1) to definitely yes (5).
We ran a two-way ANOVA with the SSCS score as the dependent variable and the story and movements as factors. Residual analysis was performed to test the assumptions of the two-way ANOVA. The assumption of homogeneity of variances was not violated, as assessed by Levene's test for equality of variances (p = 0.122). Data were normally distributed, as assessed by the Kolmogorov-Smirnov test (p = 0.200). There were six outliers, assessed as values located less than 3 box-lengths from the edge of the box in a boxplot. They were retained in the ANOVA, and the results did not change when they were excluded. Neither significant main effects nor interaction effects were found. See Figure 4. Similarly, pairwise comparisons were run aiming to find simple main effects. However, once again no significant effects were found.
Hence, as suggested in Laerd (2021), a further robust ANOVA was run. We used SigmaPlot to run a Kruskal-Wallis non-parametric ANOVA. Once again, neither a main effect nor an interaction effect was found (KW = 2.11, df = 3, p = 0.550).
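For readers unfamiliar with the rank-based test mentioned above, the following Python sketch computes the Kruskal-Wallis H statistic with the usual tie correction. It is a generic textbook formulation written for illustration, not the SigmaPlot procedure used in the study; the function names are hypothetical, and the p-value (obtained from a chi-square distribution with k - 1 degrees of freedom) is deliberately left out.

```python
from collections import Counter


def average_ranks(values):
    """Rank values from 1..N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def kruskal_wallis(groups):
    """Return (H, df) for a list of samples, one list per group."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    # Tie correction: divide H by 1 - sum(t^3 - t) / (N^3 - N).
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1 - ties / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    return h / correction, len(groups) - 1
```

When the four condition groups (SS, SC, MS, MC) are passed as four samples, the function returns df = 3, which matches the degrees of freedom reported above.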

Question twelve and Godspeed items
A similar procedure was followed for the analysis of the rest of the dependent variables. Effects were found only for the items of animacy, likeability, and perceived safety. Data were normally distributed, as assessed by the Kolmogorov-Smirnov test, for animacy (p = 0.200) and likeability (p = 0.200) but not for perceived safety (p < 0.001) and Q12 (p < 0.001). The assumption of homogeneity of variances was not violated for animacy (p = 0.989), perceived safety (p = 0.173), and Q12 (p = 0.909), as assessed by Levene's test for equality of variances. However, it was violated for likeability (p = 0.006).
We decided to retain the outliers, assessed as values located less than 3 box-lengths from the edge of the box in a boxplot. The results indicate that movement has a main effect on how participants perceive the robots under the different conditions: animacy, F(3,235) = 39.777, p < 0.001; likeability, F(3,235) = 12.824, p < 0.001; and perceived safety of the robot, F(3,235) = 19.127, p < 0.001. Due to the violation of the assumption of homogeneity of variances for likeability, a robust non-parametric Kruskal-Wallis ANOVA was run for this variable, and movement again showed a significant main effect (KW = 10.205, df = 3, p = 0.017). See Table 2.
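The box-length criterion mentioned above can be illustrated with a short Python sketch. It assumes the common boxplot convention in which a value lying more than 1.5 but no more than 3 box-lengths (interquartile ranges) beyond the box is an outlier and a value lying more than 3 box-lengths away is an extreme point; the 1.5 lower bound is an assumption, since the text only states the upper bound of 3. Quartiles are computed with Python's default method, which may differ slightly from the hinges used by statistical packages.

```python
from statistics import quantiles


def classify_boxplot_points(values):
    """Label each value as 'ok', 'outlier' or 'extreme'.

    'outlier': more than 1.5 and at most 3 box-lengths beyond the box.
    'extreme': more than 3 box-lengths beyond the box.
    """
    q1, _, q3 = quantiles(values, n=4)  # default 'exclusive' method
    box = q3 - q1
    labels = []
    for v in values:
        # Distance from the nearest edge of the box (0 if inside the box).
        distance = max(q1 - v, v - q3, 0)
        if distance > 3 * box:
            labels.append("extreme")
        elif distance > 1.5 * box:
            labels.append("outlier")
        else:
            labels.append("ok")
    return labels


if __name__ == "__main__":
    print(classify_boxplot_points([1, 2, 3, 4, 5, 6, 7, 8, 9, 20, 30]))
```

A sensitivity check of the kind described in the text would then rerun the analysis once with all values and once without those labeled as outliers.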

Person's correlations among CPI, CSE, and SSCS in humans and robots
A one-tailed Pearson's correlation was carried out to assess the relationship between the human CPI, CSE, and SSCS scores and the similar scores granted to the robot. We aimed to find out whether people who perceive themselves as highly creative project this onto the robot's creative act. The scores of the 239 participants were analyzed. There were statistically significant, moderate to strong positive correlations between CPI-CSE, SSCS-CSE, and SSCS-CPI in humans, and a similar pattern of correlations among the robot scores, suggesting consistency in our proposed scale and its internal scores. However, no significant correlation appeared between human scores and robot scores. We should highlight that, overall, the means of the human scores were higher than those of the robot scores under all the experimental conditions and for all the scores. See Table 3.
Figure caption: Boxplot per group of the SSCS applied to the robot with outliers. No significant differences were found among the means of the groups.
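The correlation analysis described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the data are randomly generated placeholders, the function names are hypothetical, and the one-tailed p value uses a normal approximation to the t statistic, which is only reasonable for large samples such as the 239 participants analyzed here.

```python
import math
import random
from statistics import NormalDist

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def one_tailed_p(r, n):
    """Approximate P(R >= r) under no correlation (large-sample approximation)."""
    # Assumes |r| < 1; t has n - 2 degrees of freedom and is close to normal for large n.
    t = r * math.sqrt((n - 2) / (1 - r * r))
    return 1 - NormalDist().cdf(t)

# Hypothetical, independently generated scores on a 1-5 scale (not study data).
rng = random.Random(0)
human = [rng.uniform(1, 5) for _ in range(239)]
robot = [rng.uniform(1, 5) for _ in range(239)]

r = pearson_r(human, robot)
p = one_tailed_p(r, len(human))
print(f"r = {r:.3f}, one-tailed p = {p:.3f}")
```

A positive one-tailed test is used in this sketch because the hypothesis is directional: participants who rate themselves as more creative would be expected to rate the robot as more creative.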

Discussion
This study aims to explore the design of future social robots using a quantitative approach and qualitative insights from the users. With this study, we want to encourage a discussion in the domain of robots' perceived creativity and explore how robot movement supports the delivery of a creative act. However, studying creativity using quantitative approaches presents a number of challenges. We frame our findings in the context of games as a means to sustain long-term, meaningful human-robot creative interactions, and several considerations should be taken into account.
To answer the first research question (to what extent do humans perceive creative agency in robots when they tell a story as part of a game, that is, when they display a creative act?), we first ensured that participants watched the video by requiring a checkbox validation and a brief description of what they saw. Participants frequently used the word "story" or even specifically "ugly duckling story" in these descriptions; 92.5% of the participants mentioned the word "story" or similar (tale, history, story line, and fable) when describing the video. In a few cases typos or variants such as history, story, or study were used, but the intention was taken to be the same. Furthermore, 100% of the participants referred directly or indirectly to the act of telling a story, even when they did not use the word in their descriptions, for instance, "it is about a robot that tells ugly duck that turns into swan" or "the process to find the real beauty." We highlight the fact that the SSCS scores of the robots, even though they are not significantly different across conditions, were above the mean (2.5) of the 1-5 scale of the SSCS score, as indicated in section 5.1. Even though the robot scores were lower than the human scores, the lowest score was above 3.4, for the still robot standard story condition. The scores per condition were: SS = 3.48 (SD = 0.73), SC = 3.66 (SD = 0.61), MS = 3.58 (SD = 0.68), and MC = 3.53 (SD = 0.80). Hence, we can claim that participants were aware that the robot was performing a creative act by delivering a story. Future work can involve further statistical analysis against a specific benchmark, indicating the minimal score at which a social agent is considered a creative agent. Similarly, a face-to-face setup could be more appropriate for an experiment of this nature, since it is likely that the robot embodiment has a significant impact on the participants' perceptions compared to virtual agents.
For the second question (to what extent do humans perceive creative agency in robots when these display movements accompanying the delivery of the story?), we considered that robot movement would be a variable moderating the effect of the story on how people perceive robots. The marginal means graph can wrongly lead one to conclude that movements moderate the SSCS score. However, when inspecting the boxplot, it is clear that the differences among the means of the experimental conditions are not significant. See Figure 4. Although we did not observe main or interaction effects in the SSCS score, we did observe significant effects in three of the items of the Godspeed scale: animacy, likeability, and perceived safety.
In the case of animacy, we observed that participants in this study noticed the movements of the robot, as they scored the moving robots significantly higher in animacy regardless of which story the robot was telling. This shows that participants are aware of the movement and of how the movement impacts participants' perceptions in terms of likeability, as they ranked the moving robots, MS = 3 (0.68) and MC = 2.84 (0.66), significantly higher than the still robots, SS = 2.36 (0.76) and SC = 2.33 (0.74). See Table 2.
The robot movements were not mentioned as frequently as the story in the descriptions of the video. However, some participants used anthropomorphic terms, for example: "a creepy robot gives its version of The Ugly Duckling by Hans Christian Andersen. It also does a decent MC hammer impression."; "...The movements of the robot were however a bit erratic and did not match that well with the story it was telling."; and "In a way I feel like I was arranged to tell that story but I liked the movements and the appearance of the robot." In terms of likeability, as shown in Table 2, participants scored the moving robots, MS = 3.96 (0.62) and MC = 3.096 (0.62), significantly higher than the still robots, SS = 3.49 (0.96) and SC = 3.61 (0.96). The significant main effect of the movements on likeability is aligned with previous studies using games (Sandoval et al., 2016a,b, 2020). Apparently, humanoid robots tend to be likeable when they perform unexpected tasks that can be interpreted as social or creative. Future studies could test other robot embodiments perceived as less humanoid. An illustrative comment on how some participants perceived the robot was: "how this robot looks like and how it feels like compared to a human being. By his movements, he looked really happy, but by talking only sometimes we cannot understand how smart a robot can be. He was very smart, and we can definitely see that his voice was not recorded at some point." At the beginning of this experiment we considered that movement would be a moderating variable, supporting the delivery of the story by the robot. In other words, robot movement would lead to higher scores for the robots in both kinds of stories, or at least in one of them. However, no evidence for this was found; a possible reason is the type of movement the robot performed.
For this experiment, we intentionally designed a set of robot movements that are part of the standard programming of the robot but are not synchronized with the delivery of the story. The reason for this is that in future applications using robots for social games, it is unlikely that robot movements will always be customized to their dialogs. Better synchronized and choreographed movements would be an obvious stimulus that could cause significant main effects in the SSCS score and in how people perceive robots as creative agents. This type of movement can be tested in future studies. In terms of perceived safety, moving robots were perceived as safer than still robots. This result is worth considering further, especially taking into account that this study was an online experiment and not a face-to-face setup.
Question twelve in the survey, "I think that the robot can consistently create good stories when playing Story Cubes," was added to allow the participants to summarize their impressions from the previous SSCS questions. The main effect of movement was not significant (p = 0.64) for this question. However, it provides an intriguing result to be explored in future studies, as the comments of the participants suggest. Certainly, the participants perceived the movement of the robot but not necessarily the novelty of the story. Independently of the story, participants rated the moving robots slightly higher than the still robots as agents that can create good stories when playing Story Cubes. Even as a marginal result, this finding points to the importance of studying a range of robot movement approaches in future work. Further qualitative analysis is required to explore it.
For the third research question (how are robots perceived as creative agents compared with humans?), robots scored lower than humans on the SSCS in all the experimental conditions. Robot scores can be seen in Section 5.1, and human SSCS scores are as follows: SS = 3.88 (SD = 0.68), SC = 3.85 (SD = 0.83), MS = 3.88 (SD = 0.70), and MC = 3.95 (SD = 0.66). Pearson correlation does not suggest any significant correlation between the human and robot scores. However, there are significant correlations between the sub-scores CPI and CSE in both human and robot scores, which could suggest that the original scale and our adaptation for assessing robot creativity are internally consistent and have potential as measurement tools for future studies in human-robot creative interactions. See Table 3.
Some of the participants' comments suggest that further exploration of this question could be relevant, as they were enthusiastic about the capabilities of the robot compared with previous human creative experiences. To illustrate, the participants adored the robot. They thought it was cute, and they were impressed with its abilities too. They said they could probably guess how it works, but still exclaimed that it was mind-blowing. The participants loved that the message of the story told by the robot was so wholesome. As a musician and something of a songwriter, one participant found it astonishing that the robot can come up with a good story in such a short amount of time, and encouraged us to keep it up. As for the study itself, they liked the way the text lit up when the mouse hovered over it, which they had rarely, if ever, seen. In addition, they had to look up two of the English words used in the study, which they highly valued as an educational feature. They thoroughly enjoyed the experience. Participants described the study as an interesting concept to consider and said they would honestly like to see more content involving AI and Story Cubes. They were impressed by how a robot can tell people a story based on random images.

Conclusion
Creativity can be considered an aspect of autonomy and agency in social agents that is different from intelligence, logic, and strategy. The current understanding of how creativity is displayed by robots is still limited. This study aimed to inform the design possibilities of human-robot creative interaction (HRCI) and provides a reference for future studies exploring the main factors involved in the creative interaction between humans and robots. Our findings show that the setup used in this study does not trigger higher scores in the SSCS that would differentiate the robots as creative agents. However, movement does show main effects in the scores of animacy, likeability, and perceived safety of the Godspeed scale. Furthermore, the scores of the moving robots were above the mean in all cases (although they were lower than the SSCS scores in humans). In terms of how robots are perceived as good storytellers (question 12 in the robot SSCS), even though the scores are not significantly different, the results provide an important insight into how to continue the development of future experimental studies, for instance, the need to perform a similar study in a face-to-face setup and with robot embodiments beyond humanoid robots.
The study of creativity in robots shows a research gap when addressed in playful, creative, and collaborative activities such as board games. We chose a playful task (a storytelling game) to empirically evaluate the extent to which a robot's physical embodiment may cause humans to attribute creative agency to a robot. The Story Cubes game offers a means to further assess the display of creativity in robots, considering the applicability of this setup as entertainment and for the development of cognitive skills, spatial memory, decision making, and collaborative skills (Wu et al., 2012; Unbehaun et al., 2019).
Furthermore, we consider that our approach is useful for devising strategies for long-term interaction and as an alternative that avoids screen addiction and contributes to better mental health in the digital age (Aboujaoude and Starcevic, 2015; Sandoval, 2019). Even in the Story Cubes mobile app, the user experience is visibly compromised when compared to the physical cubes. Considering this, we highlight the importance of perceived creativity in social robots to further develop the early work in artificial creativity. It appears critical to explore the advantages and disadvantages of robotic interfaces that display and support creative interactions. To this end, when users are exposed to stimuli related to creative robots, it seems critical to set their expectations in this type of study. One participant said, for instance: "the study itself was fine. The premise, however, is something that has not been fully explained. For example, is this a prototype of a children's toy? Is it a learning device? Is it a diagnostic tool?" Finally, one of the participants' comments encourages us to continue the development of studies on creative robots playing storytelling games. Assuming the robot really came up with the story using its own creativity and it was not programmed into it, the participant was very impressed with the level of depth that the story had. In that sense, the robot could even be more creative than a lot of humans. The participant also thought that its use of words and its manner of speech were properly human-like, which is not to say that the robotic nature behind it could not be felt at all. In the participant's opinion, some work should be performed on the robot's movement and how it connects to whatever it is saying in a way that makes more sense. The participant wished us good luck with the study and with the further development of creative robots.

Limitations and future work
The main technical limitations to implementing a robot board game have been discussed before by Sandoval et al. (2021). The game Story Cubes in particular has a significant element of improvisation and randomness, and the translation of the cubes into a consistent story is a technical challenge for robots that genuinely synthesize stories from these stimuli. Similarly, the vision system required to read the cubes accurately under a range of lighting conditions and angles would require significant technical work. We are currently working on an implementation where participants and robots play Story Cubes in a shared physical setup for a future study. Choreographic movements of the robots will be programmed and displayed following a bottom-up strategy that incorporates movements at an increasing level of sophistication and detail. Then, we will compare them with the random movements of this experiment. We consider that this would help robot designers incorporate movement that achieves a balance between creating a meaningful human-robot creative interaction (HRCI) and drawing from a library of gestures, postures, and body movements suitable for fluid communication. In terms of data collection and analysis, we plan to conduct a thematic analysis of participants' comments. Furthermore, future experimental designs could include different robot embodiments (humanoid vs. non-humanoid), variants of Story Cube games, and different stories (original stories vs. well-known stories). Finally, a more exhaustive validation of our version of the SSCS (performing factorial and reliability analysis) may be required to assess future face-to-face interaction setups in a more robust manner.

Data availability statement
The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author.