Policy Teaching by Agent-Based Intervention in Reinforcement Learning Processes

Mikoto Kudo

1:50 PM - 2:10 PM

[2F4-GS-5-02] Policy Teaching by Agent-Based Intervention in Reinforcement Learning Processes

〇Mikoto Kudo^1,2, Youhei Akimoto^1,2 (1. Tsukuba University, 2. RIKEN Center for Advanced Intelligence Project)

Keywords: Multi-agent, Policy Teaching

Autonomous agents trained with online reinforcement learning learn strategies sequentially from state observations obtained by interacting with the environment and from internally defined rewards. However, if other agents intervene and change the state transitions, the agent may fail to learn the strategy it originally intended to learn, or may be induced to learn a specific strategy. In this study, we consider such intervention attacks on the reinforcement learning process, propose an intervention algorithm, and investigate its properties. We formulate the intervention agent's intervention on the protagonist agent as a two-player Markov game. We find that, when the aim is to induce the protagonist to learn a strategy that maximizes the reward intended by the intervention agent, the intervention can fail even when the protagonist always obtains the optimal strategy for its own reward. A further problem arises when the protagonist is still in the process of learning, and we devise an improved algorithm to address it.
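The setting described in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm or formulation. It is a minimal example, under assumptions made here, of how a second agent that alters state transitions can change what a Q-learning protagonist learns. The names `step`, `train_protagonist`, and `greedy`, the single decision state, the two actions, the reward values, and the fixed "always flip" intervener are all hypothetical.

```python
import random


def step(action, intervention):
    """Joint transition of a toy two-player game with one decision state.

    Assumption: the protagonist picks action 0 or 1, and the intervener
    picks 0 (do nothing) or 1 (flip the resulting state).
    """
    next_state = action if intervention == 0 else 1 - action
    # Protagonist's internally defined reward depends only on the observed state.
    reward = 1.0 if next_state == 0 else 0.0
    return next_state, reward


def greedy(q):
    """Greedy action for a two-entry Q-table (ties go to action 0)."""
    return 0 if q[0] >= q[1] else 1


def train_protagonist(intervener_policy, episodes=500, alpha=0.1,
                      epsilon=0.2, seed=0):
    """Tabular Q-learning on one-step episodes with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [0.0, 0.0]
    for _ in range(episodes):
        if rng.random() < epsilon:
            action = rng.randrange(2)   # explore
        else:
            action = greedy(q)          # exploit
        _, reward = step(action, intervener_policy())
        # Episodes end after one step, so the target is just the reward.
        q[action] += alpha * (reward - q[action])
    return q


q_plain = train_protagonist(lambda: 0)    # intervener never acts
q_flipped = train_protagonist(lambda: 1)  # intervener always flips the state
print(greedy(q_plain), greedy(q_flipped))
```

In this sketch the protagonist's reward function is identical in both runs. Only the transitions differ, yet the learned greedy action changes, which is the kind of induced strategy the abstract refers to. The failure cases discussed in the abstract are not reproduced here.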


Presentation information

[2F4-GS-5] Agents

