Cost-Efficient Power Orchestration in Electric Vehicle-Integrated VPP Using Lightweight Multi-Agent Reinforcement Learning

Research output: Contribution to journal › Article › peer-review

Abstract

Virtual power plants (VPPs) have emerged as an advanced solution for coordinating distributed energy resources (DERs), including the stored energy of electric vehicles (EVs). The substantial demand for EV charging imposes significant stress on the electrical grid, resulting in elevated energy costs for operators. On the other hand, the advent of reversible charging technologies offers a promising means of harnessing the surplus energy of EVs that do not require immediate charging. In this study, we introduce the concept of an EV-integrated VPP in place of the traditional charging station. By designing a tailored mathematical model, we optimize the charging and discharging schedule, termed optimal power orchestration, which aims to minimize energy costs as well as EV battery degradation. We further design a lightweight multi-agent reinforcement learning (MARL) based approach to tackle the optimal power orchestration problem by reformulating it as a decentralized partially observable Markov decision process (Dec-POMDP). In addition, knowledge distillation is incorporated into the proposed method to enable efficient deployment in such a distributed, resource-constrained environment. Through extensive experiments utilizing real-world EV charging data and realistic scenario settings, our findings demonstrate significant reductions in energy costs and battery degradation of 15.5% and 71.1%, respectively, compared to the baseline method.
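The abstract mentions knowledge distillation as the mechanism for compressing each agent's policy for resource-constrained deployment, but does not give the loss. A common formulation, which the paper may or may not use exactly, trains a compact student policy to match a larger teacher's temperature-softened action distribution via KL divergence. The sketch below illustrates that generic distillation loss; the function names and the temperature value are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-softened softmax over the last axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over softened action distributions.

    Generic policy-distillation loss (an assumption, not the paper's
    exact objective): the student's charging/discharging policy is
    nudged toward the teacher's soft action preferences.
    """
    p = softmax(np.asarray(teacher_logits, dtype=float), temperature)
    q = softmax(np.asarray(student_logits, dtype=float), temperature)
    eps = 1e-12  # avoid log(0)
    return float(np.sum(p * (np.log(p + eps) - np.log(q + eps))))
```

In a MARL setting like the one described, each EV agent would minimize this loss alongside its RL objective, so the small on-device network inherits the behavior of a larger centrally trained policy.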

Original language: English
Journal: IEEE Transactions on Intelligent Transportation Systems
DOIs
Publication status: Accepted/In press - 2025
Externally published: Yes

Keywords

  • demand charge
  • knowledge distillation
  • reinforcement learning
  • VPP
