Optimizing Deep Q-Learning Experience Replay with SHAP Explanations: Exploring Minimum Experience Replay Buffer Sizes in Reinforcement Learning

Robert S. Sullivan, Luca Longo

Research output: Contribution to journal › Conference article › peer-review

Abstract

Explainable Reinforcement Learning (xRL) faces challenges in debugging and interpreting Deep Reinforcement Learning (DRL) models. A lack of understanding of internal components such as Experience Replay, which stores and samples transitions from the environment, risks wasting computational and memory resources. This paper presents an xRL-based Deep Q-Learning (DQL) system that uses SHAP (SHapley Additive exPlanations) to explain the contribution of input features. States sampled from Experience Replay are used to build SHAP heatmaps that show how the buffer's contents influence the actions of the neural network Q-value approximator. The xRL-based system aids in determining the smallest Experience Replay size for 23 simulations of varying complexity. It contributes an xRL optimization method, complementing traditional approaches, for tuning the Experience Replay size hyperparameter. This visual approach achieves a reduction of over 40% in Experience Replay size for 18 of the 23 tested simulations, well below the commonly used sizes of 1 million transitions or 90% of total environment transitions.
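The abstract does not include code, but the pipeline it describes (sampling states from the Experience Replay buffer and attributing the Q-network's outputs to input features with SHAP) can be sketched as below. This is a minimal, hypothetical illustration built on the shap library's DeepExplainer and a toy PyTorch Q-network; the network architecture, buffer layout, and sample sizes are assumptions for illustration, not the authors' implementation.

    # Hypothetical sketch: SHAP attributions for a Q-network over states
    # drawn from an Experience Replay buffer (not the authors' code).
    import numpy as np
    import torch
    import torch.nn as nn
    import shap

    class QNetwork(nn.Module):
        """Small MLP Q-value approximator: state features -> one Q-value per action."""
        def __init__(self, n_features: int, n_actions: int):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(n_features, 64), nn.ReLU(),
                nn.Linear(64, 64), nn.ReLU(),
                nn.Linear(64, n_actions),
            )

        def forward(self, x):
            return self.net(x)

    n_features, n_actions = 8, 4
    q_net = QNetwork(n_features, n_actions)  # in practice, a trained DQL agent's network

    # Stand-in for the states stored in the Experience Replay buffer.
    replay_states = np.random.randn(10_000, n_features).astype(np.float32)

    # Background (reference) set and explanation set, both sampled from the buffer.
    rng = np.random.default_rng(0)
    background = torch.tensor(replay_states[rng.choice(len(replay_states), 100, replace=False)])
    to_explain = torch.tensor(replay_states[rng.choice(len(replay_states), 256, replace=False)])

    explainer = shap.DeepExplainer(q_net, background)
    shap_values = explainer.shap_values(to_explain)

    # Depending on the shap version, the result is a list of per-action arrays or a
    # single (samples, features, actions) array; normalise to (actions, samples, features).
    sv = (np.stack(shap_values) if isinstance(shap_values, list)
          else np.moveaxis(np.asarray(shap_values), -1, 0))

    # Mean absolute attribution per feature and action: the kind of summary a
    # SHAP heatmap over replay samples could be aggregated from.
    importance = np.abs(sv).mean(axis=1)  # shape: (actions, features)
    print(importance)

Under these assumptions, repeating such attributions while shrinking the buffer would indicate how small the Experience Replay can be made before the feature contributions driving the agent's actions degrade.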

Original language: English
Pages (from-to): 89-94
Number of pages: 6
Journal: CEUR Workshop Proceedings
Volume: 3554
Publication status: Published - 2023
Event: Joint 1st World Conference on eXplainable Artificial Intelligence: Late-Breaking Work, Demos and Doctoral Consortium (xAI-2023: LB-D-DC) - Lisbon, Portugal
Duration: 26 Jul 2023 - 28 Jul 2023

Keywords

  • Deep Reinforcement Learning
  • Experience Replay
  • SHapley Additive exPlanations
  • eXplainable Artificial Intelligence
