
Explaining Deep Q-Learning Experience Replay with SHapley Additive exPlanations

  • Robert S. Sullivan
  • Luca Longo

Research output: Contribution to journal, Article, peer-review

Abstract

Reinforcement Learning (RL) has shown promise in optimizing complex control and decision-making processes, but Deep Reinforcement Learning (DRL) lacks interpretability, limiting its adoption in regulated sectors like manufacturing, finance, and healthcare. Difficulties arise from DRL’s opaque decision-making, which hinders efficiency and resource use; this issue is amplified with every advancement. While many seek to move from Experience Replay to A3C, the latter demands more resources. Despite efforts to improve Experience Replay selection strategies, there is a tendency to keep the capacity high. We investigate training a Deep Convolutional Q-learning agent across 20 Atari games, intentionally reducing Experience Replay capacity from (Formula presented.) to (Formula presented.). We find that a reduction from (Formula presented.) to (Formula presented.) does not significantly affect rewards, offering a practical path to resource-efficient DRL. To illuminate agent decisions and align them with game mechanics, we employ a novel method: visualizing Experience Replay via Deep SHAP Explainer. This approach fosters comprehension and transparent, interpretable explanations, though any capacity reduction must be made cautiously to avoid overfitting. Our study demonstrates the feasibility of reducing Experience Replay capacity and advocates for transparent, interpretable decision explanations using the Deep SHAP Explainer to promote resource efficiency in Experience Replay.
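
The notion of Experience Replay capacity discussed in the abstract can be illustrated with a minimal sketch of a fixed-capacity buffer. This is not the authors' implementation: the names ReplayBuffer and Transition, the uniform sampling strategy, and the toy capacity of 5 are all assumptions chosen for illustration. It only shows that once the buffer is full, each new transition evicts the oldest one, so a smaller capacity means the agent trains on a narrower, more recent window of experience.

```python
import random
from collections import deque, namedtuple

# Hypothetical record for one agent-environment interaction.
Transition = namedtuple(
    "Transition", ["state", "action", "reward", "next_state", "done"]
)


class ReplayBuffer:
    """Fixed-capacity FIFO store of transitions (illustrative sketch only)."""

    def __init__(self, capacity, seed=None):
        # A deque with maxlen drops its oldest item when a new one
        # is appended to a full buffer.
        self.memory = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def push(self, state, action, reward, next_state, done):
        self.memory.append(Transition(state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement (an assumed strategy).
        return self.rng.sample(list(self.memory), batch_size)

    def __len__(self):
        return len(self.memory)


# Toy usage: capacity 5, eight pushes, so the three oldest are evicted.
buffer = ReplayBuffer(capacity=5, seed=0)
for step in range(8):
    buffer.push(state=step, action=0, reward=1.0, next_state=step + 1, done=False)

batch = buffer.sample(batch_size=3)
```

After the loop, the buffer holds only the transitions for steps 3 through 7, and every sampled minibatch is drawn from that window. Memory use scales with capacity times the size of one stored transition, which is why capacity is a natural lever for resource efficiency.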

Original language: English
Pages (from-to): 1433-1455
Number of pages: 23
Journal: Machine Learning and Knowledge Extraction
Volume: 5
Issue number: 4
DOIs
Publication status: Published - Dec 2023

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 8 - Decent Work and Economic Growth
  2. SDG 12 - Responsible Consumption and Production

Keywords

  • SHapley Additive exPlanations
  • deep reinforcement learning
  • eXplainable artificial intelligence
  • experience replay
