AUTHOR=Hoffman Robert R., Jalaeian Mohammadreza, Tate Connor, Klein Gary, Mueller Shane T.
TITLE=Evaluating machine-generated explanations: a “Scorecard” method for XAI measurement science
JOURNAL=Frontiers in Computer Science
VOLUME=5
YEAR=2023
URL=https://www.frontiersin.org/journals/computer-science/articles/10.3389/fcomp.2023.1114806
DOI=10.3389/fcomp.2023.1114806
ISSN=2624-9898
ABSTRACT=In recent research on Explainable AI (XAI), most systems provide explanations that are only clues or hints about the computational models, such as feature lists or saliency images. However, a user might want answers to deeper questions, such as: How does it work? Why did it do that instead of something else? What things can it get wrong? Reported examples of machine-generated explanations can be placed on a scale reflecting depth of explanation, that is, the degree to which the explanations support the user's sensemaking. The seven levels of this scale form the Explanation Scorecard. This article presents the Scorecard and the method by which it was developed and validated. The Scorecard was applied in an analysis of the recent literature, showing that many systems still present low-level explanations. Developers can use the Scorecard to conceptualize how they might extend their machine-generated explanations to support users in developing a mental model that instills appropriate trust and reliance. The article concludes with recommendations for improving XAI systems with regard to cognitive considerations, and for how results of XAI system evaluations should be reported.