9:40 AM - 10:00 AM
[2G1-GS-11-03] Targeted Manipulation Attacks on Reinforcement Learning Agents through Imitation Learning with Perturbed Observations
Keywords: Deep Reinforcement Learning, Adversarial Attack, Generative Adversarial Networks
Deep reinforcement learning (DRL) is known to be vulnerable to adversarial attacks, and improving the robustness of DRL agents is necessary for real-world applications. To investigate this vulnerability, we propose a targeted manipulation attack that specifies the behavior of a victim agent, assuming a realistic attack setting. As the threat model, we consider an attacker who can add perturbations to the victim agent's observations, with the goal of manipulating the victim's behavior. The attacker expresses the desired behavior as a trajectory and attacks the victim agent so that it imitates this trajectory; we realize the attack using imitation learning. Through experiments on MetaWorld, a benchmark for reinforcement learning, we confirm that the targeted manipulation attack succeeds under this threat model.
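A minimal sketch of the kind of pipeline the abstract describes, assuming a GAIL-style adversarial imitation objective (suggested by the "Generative Adversarial Networks" keyword): a frozen victim policy acts on perturbed observations, and a bounded perturbation generator is trained so that the victim's induced (state, action) behavior becomes indistinguishable from an attacker-specified target trajectory. All network shapes, the L-infinity budget EPS, and the synthetic placeholder data are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a targeted observation-perturbation attack trained
# via adversarial imitation (GAIL-style); not the authors' code.
import torch
import torch.nn as nn

OBS_DIM, ACT_DIM, EPS = 8, 2, 0.1  # EPS: assumed L-inf perturbation budget

# Frozen victim policy: the attacker can query it but not modify it.
victim = nn.Sequential(nn.Linear(OBS_DIM, 64), nn.Tanh(), nn.Linear(64, ACT_DIM))
for p in victim.parameters():
    p.requires_grad_(False)

# Perturbation generator: Tanh output keeps the perturbation in [-1, 1],
# scaled by EPS to respect the budget.
generator = nn.Sequential(nn.Linear(OBS_DIM, 64), nn.ReLU(),
                          nn.Linear(64, OBS_DIM), nn.Tanh())
# Discriminator: separates target (state, action) pairs from victim behavior.
discriminator = nn.Sequential(nn.Linear(OBS_DIM + ACT_DIM, 64), nn.ReLU(),
                              nn.Linear(64, 1))

g_opt = torch.optim.Adam(generator.parameters(), lr=1e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-4)
bce = nn.BCEWithLogitsLoss()

# Attacker-specified target trajectory: the (state, action) pairs the victim
# should be manipulated into reproducing. Random placeholders here.
target_sa = torch.randn(256, OBS_DIM + ACT_DIM)

def rollout(batch=256):
    """Victim acts on perturbed observations; returns induced (s, a) pairs."""
    obs = torch.randn(batch, OBS_DIM)        # stand-in for env observations
    perturbed = obs + EPS * generator(obs)   # bounded observation attack
    actions = victim(perturbed)              # gradient flows through the input
    return torch.cat([obs, actions], dim=-1)

for step in range(1000):
    # Discriminator step: label target trajectory 1, victim behavior 0.
    victim_sa = rollout().detach()
    d_loss = (bce(discriminator(target_sa), torch.ones(len(target_sa), 1))
              + bce(discriminator(victim_sa), torch.zeros(len(victim_sa), 1)))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()
    # Generator step: make the victim's induced behavior look like the target.
    g_loss = bce(discriminator(rollout()), torch.ones(256, 1))
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
```

In this sketch the victim's parameters stay frozen and gradients reach the generator only through the victim's forward pass, matching the threat model in which the attacker controls observations but not the agent itself; a real attack on MetaWorld would replace the synthetic rollout with environment interaction.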