Explaining Deep Q-Learning Experience Replay with SHapley Additive exPlanations

Robert S. Sullivan, Luca Longo

Research output: Contribution to journal › Article › peer-review

Abstract

Reinforcement Learning (RL) has shown promise in optimizing complex control and decision-making processes, but Deep Reinforcement Learning (DRL) lacks interpretability, limiting its adoption in regulated sectors like manufacturing, finance, and healthcare. Difficulties arise from DRL’s opaque decision-making, which hinders efficiency and resource use, and this issue is amplified with every advancement. While many seek to move from Experience Replay to A3C, the latter demands more resources. Despite efforts to improve Experience Replay selection strategies, there is a tendency to keep the capacity high. We investigate training a Deep Convolutional Q-learning agent across 20 Atari games while intentionally reducing Experience Replay capacity from (Formula presented.) to (Formula presented.). We find that a reduction from (Formula presented.) to (Formula presented.) does not significantly affect rewards, offering a practical path to resource-efficient DRL. To illuminate agent decisions and align them with game mechanics, we employ a novel method: visualizing Experience Replay via Deep SHAP Explainer. This approach fosters comprehension and transparent, interpretable explanations, though any capacity reduction must be made cautiously to avoid overfitting. Our study demonstrates the feasibility of reducing Experience Replay capacity and advocates for transparent, interpretable decision explanations using the Deep SHAP Explainer to promote resource efficiency in Experience Replay.
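
To make the notion of Experience Replay capacity concrete, the following is a minimal sketch of a fixed-capacity replay buffer. It is not the authors' implementation: the class name `ReplayBuffer`, the uniform sampling strategy, and the tiny capacity used in the usage lines are all assumptions chosen for illustration, and the capacities studied in the paper are not reproduced in this record.

```python
import random
from collections import deque


class ReplayBuffer:
    """Hypothetical fixed-capacity replay buffer (illustrative only)."""

    def __init__(self, capacity, seed=None):
        # A deque with maxlen evicts the oldest transition once capacity is reached.
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def push(self, state, action, reward, next_state, done):
        # Store one transition as a tuple.
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement (an assumption, not the paper's strategy).
        return self.rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


# Usage: with capacity 3, pushing 5 transitions keeps only the 3 most recent.
buffer = ReplayBuffer(capacity=3, seed=0)
for t in range(5):
    buffer.push(state=t, action=0, reward=0.0, next_state=t + 1, done=False)
batch = buffer.sample(batch_size=2)
```

In this sketch, lowering `capacity` bounds memory use directly, at the cost of the agent training on a narrower window of recent experience.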

Original language: English
Pages (from-to): 1433-1455
Number of pages: 23
Journal: Machine Learning and Knowledge Extraction
Volume: 5
Issue number: 4
Publication status: Published - Dec 2023

Keywords

  • SHapley Additive exPlanations
  • deep reinforcement learning
  • eXplainable artificial intelligence
  • experience replay
