General Framework for the Optimization of the Human-Robot Collaboration Decision-Making Process Through the Ability to Change Performance Metrics

This paper proposes a new decision-making framework in the context of Human-Robot Collaboration (HRC). State-of-the-art techniques treat HRC as an optimization problem in which the utility function, also called the reward function, is defined to accomplish the task regardless of how well the interaction is performed. When performance metrics are considered, they cannot easily be changed within the same framework. In contrast, our decision-making framework can easily handle a change of performance metrics from one scenario to another. Our method treats HRC as a constrained optimization problem in which the utility function is split into two main parts. First, a constraint defines how to accomplish the task. Second, a reward evaluates the performance of the collaboration; this is the only part that is modified when the performance metrics change. This gives control over how the interaction unfolds, and it guarantees that the robot adapts its actions to the human's in real time. In this paper, the decision-making process is based on the Nash equilibrium and perfect-information extensive-form games from game theory. It can handle collaborative interactions under different performance metrics, such as optimizing the time to complete the task or accounting for the probability of human errors. Simulations and a real experimental study on an assembly task (i.e., a game based on a construction kit) illustrate the effectiveness of the proposed framework.
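To make the solution concept named above concrete, the following sketch applies backward induction to a small perfect-information extensive-form game. The tree, payoffs, turn order, and function name are illustrative assumptions for this supplementary material, not the paper's actual game formulation.

```python
# Minimal sketch: backward induction on a perfect-information
# extensive-form game. Player 0 (robot) moves first, then player 1
# (human); payoffs are hypothetical (robot_payoff, human_payoff) pairs.

def backward_induction(node):
    """Return (payoff_pair, action_plan) chosen by backward induction."""
    player, children = node["player"], node.get("children")
    if not children:
        return node["payoff"], []
    best = None
    for action, child in children.items():
        value, plan = backward_induction(child)
        # Each player maximizes their own component of the payoff pair.
        if best is None or value[player] > best[0][player]:
            best = (value, [action] + plan)
    return best

# Tiny two-level tree: the robot (player 0) acts, then the human (player 1).
leaf = lambda p: {"player": 1, "payoff": p}
tree = {
    "player": 0,
    "children": {
        "assemble": {"player": 1, "children": {"help": leaf((3, 2)),
                                               "pass": leaf((1, 1))}},
        "wait":     {"player": 1, "children": {"help": leaf((2, 3)),
                                               "pass": leaf((0, 0))}},
    },
}

value, plan = backward_induction(tree)
print(value, plan)  # subgame-perfect payoffs and the action sequence
```

For this toy tree, the procedure returns the payoff pair (3, 2) reached by the robot assembling and the human helping, i.e., the subgame-perfect outcome.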


ENTIRE EXAMPLE OF A SIMULATED TEST ON THE ASSEMBLY TASK
In the main paper, we presented the best simulated results to illustrate the percentage of time improvement and the percentage of reduction of the number of human errors. In this section, we present both results for the same simulated test as an example. We consider a 3-cube puzzle with a ratio of the time taken by each agent (the human h and the robot r) to perform an action equal to t_{A_h}/t_{A_r} = 1/3. This is the same ratio as in the real experiment with Nao and a human participant. We define P(A_{h,g}) = I_1 as the probability that the human performs the good action, P(A_{h,w}) = I_2 as the probability that the human passes their turn, and I_3 = 1 - (I_1 + I_2) as the probability that the human makes an error.
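The human-action model above can be sketched in a few lines; the function name and the dictionary keys are assumptions for illustration, not the paper's code.

```python
# Illustrative sketch of the human-action probability model: given
# P(good action) = I1 and P(pass) = I2, the error probability I3 is
# fully determined by I3 = 1 - (I1 + I2).

def human_action_distribution(i1, i2):
    """Return the full distribution over human actions given I1 and I2."""
    i3 = 1.0 - (i1 + i2)
    assert 0.0 <= i3 <= 1.0, "I1 + I2 must not exceed 1"
    return {"good": i1, "pass": i2, "error": i3}

dist = human_action_distribution(0.6, 0.25)
print(round(dist["error"], 3))  # 0.15
```

Any pair (I_1, I_2) with I_1 + I_2 <= 1 thus defines a valid distribution, which is why the figures below can be indexed by these two values alone.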
We note from Figure S2 that the percentage of time improvement using our utility function (C_3) instead of the state-of-the-art one (C_1) for this puzzle is up to 40%. From Figure S3, the percentage of human errors reduction for the same puzzle is up to 27.9%. In both figures, each dotted line corresponds to a specific I_1 value, and each dot corresponds to an I_2 value (read on the x-axis). For each dot, knowing I_1 and I_2, we can deduce the I_3 value using I_3 = 1 - (I_1 + I_2).
We performed many simulated tests, the results of which can be found at: https://github.com/MelodieDANIEL/Optimizing_Human_Robot_Collaboration_Frontiers.

Figure S1. Implementation of the conducted experiments using ROS.

Figure S2. Percentage of time improvement between C_3 and C_1 for a 3-cube puzzle. t_{A_h} = 15.015 and t_{A_r} = 45.045, so the ratio t_{A_h}/t_{A_r} = 1/3.

Figure S3. Percentage of human errors reduction between the predicted probability of human errors and the measured one for a 3-cube puzzle. t_{A_h} = 15.015 and t_{A_r} = 45.045, so the ratio t_{A_h}/t_{A_r} = 1/3.

RESULTING TABLE OF SIMULATION TESTS
In this section, we present the resulting table (Table S2) of the percentage of time improvement and the reduction of the number of human errors for all the figures presented in the main paper and in this supplementary material. Table S3 presents all the computation and execution times of the experiments, both real and simulated. As can be noticed, the average computation time of our decision-making framework is 0.5 s. This computation time is suitable for the targeted real tasks to which we want to apply this framework.

Table S1. Some metrics considered for the evaluation of HRC, classified based on the task types (Steinfeld et al., 2006; Bütepage and Kragic, 2017; Nelles et al., 2018). [Only a few rows are recoverable: the workload required for the human to adapt to the robot; Self-awareness: the robot knows its accuracy; Autonomy: the robot autonomy.]

Table S2. [Placeholder: for each figure (e.g., Figure 6 of the main paper), the table reports the I_1, I_2, and I_3 values, the percentage of time improvement, and the percentage of human errors reduction.]

COMPUTATION AND EXECUTION TIME OF THE TESTS

Frontiers
Table S3. Computation and execution times of the experiments, in real and in simulation (Step: Time in seconds).

Real tests:
- Computation time of the decision-making (applying our formalization): the robot takes an average of 0.5 s to choose the action to perform after knowing the state of the puzzle through the perception part.
- Time taken by the robot for the perception of the puzzle state: between 20 s and 30 s, depending on how well the cubes are placed and how many cubes are left to assemble.
- Time taken by the robot to perform a physical movement: 15 s on average.
- Waiting time for the robot when it gives an indication to the human: between 5 s and 15 s each time, depending on the indication's complexity (for example, to ask the human to remove a cube, the robot waits 5 s; to ask the human to take a certain cube and place it in a certain position, it waits 15 s).
- Global time taken by the robot to perform an action: between 20 s and 60 s, depending on the complexity of the movement (the number of cubes left to assemble at this iteration) and on whether the robot gives indications to the human. We considered it to be 60 s.
- Global time taken by the human to perform an action: between 1 s and 30 s, depending on the complexity of the movement (whether they know what to do or not). We considered it to be 20 s.
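The per-step times listed above can be combined into the robot's global per-action budget. The helper below is an illustrative sketch, not the paper's code; its default values (perception 25 s, movement 15 s, indication wait 10 s) are midpoints of the ranges reported above.

```python
# Hypothetical helper: sums the per-step times into the robot's global
# time per action. Defaults are midpoints of the measured ranges.

def robot_action_time(perception=25, movement=15, indication_wait=10,
                      gives_indication=True):
    """Perception + movement (+ waiting after an indication), in seconds."""
    total = perception + movement
    if gives_indication:
        total += indication_wait
    return total

print(robot_action_time())                        # 50
print(robot_action_time(gives_indication=False))  # 40
```

Both values fall inside the 20 s to 60 s range reported above; the simulations conservatively used the upper bound of 60 s for the robot.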

Tests in simulation:
- Time required for all probability distributions of possible human actions, without printing the figures (such as Figures S2 and S3): the Python code takes between 80 s and 100 s on a Dell laptop with an Intel Core i7 CPU and 32 GB of RAM.
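A sweep over all valid human-action probability distributions, as timed above, can be sketched as follows; the grid step and the function name are assumptions for illustration.

```python
# Illustrative sketch: enumerate all valid (I1, I2, I3) triples on a
# coarse grid, where I1 + I2 <= 1 and I3 = 1 - (I1 + I2). Each triple
# would then be fed to one simulated run of the collaboration.

def probability_grid(step=0.1):
    """Yield every (I1, I2, I3) with I1 + I2 <= 1 on a grid of the given step."""
    n = round(1 / step)
    for a in range(n + 1):
        for b in range(n + 1 - a):
            i1, i2 = a * step, b * step
            yield i1, i2, 1.0 - (i1 + i2)

grid = list(probability_grid(0.5))
print(len(grid))  # 6 triples, from (0, 0, 1) up to (1, 0, 0)
```

With the default step of 0.1 this yields 66 distributions, which is consistent with a whole sweep completing in the 80 s to 100 s reported above even at roughly a second per simulated run.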