AUTHOR=Zehfroosh Ashkan, Tanner Herbert G.

TITLE=A Hybrid PAC Reinforcement Learning Algorithm for Human-Robot Interaction

JOURNAL=Frontiers in Robotics and AI

VOLUME=Volume 9 - 2022

YEAR=2022

URL=https://www.frontiersin.org/journals/robotics-and-ai/articles/10.3389/frobt.2022.797213

DOI=10.3389/frobt.2022.797213

ISSN=2296-9144

ABSTRACT=This paper offers a new hybrid probably approximately correct (PAC) reinforcement learning (RL) algorithm for Markov decision processes (MDPs) that intelligently maintains favorable features of both model-based and model-free methodologies. The designed algorithm, referred to as the Dyna-Delayed Q-learning (DDQ) algorithm, combines model-free Delayed Q-learning and model-based R-max algorithms while outperforming both in most cases. The paper includes a PAC analysis of the DDQ algorithm and a derivation of its sample complexity. Numerical results are provided to support the claim regarding the new algorithm’s sample efficiency compared to its parents as well as the best-known PAC model-free and model-based algorithms in application. A real-world experimental implementation of DDQ in the context of pediatric motor rehabilitation facilitated by infant-robot interaction highlights the potential benefits of the reported method.