JSAI2018

Presentation information

Oral presentation


[2D4] [General Session] 13. AI Application

Wed. Jun 6, 2018 5:20 PM - 7:00 PM Room D (4F Cattleya)

Chair: Koji Morikawa (Panasonic Corporation)

5:40 PM - 6:00 PM

[2D4-02] Evaluation of Hybrid Reward Architecture on various learning policies and environments

〇Yutaro Fujimura1, Tomoyuki Kaneko1 (1. The University of Tokyo)

Keywords: Game, Reinforcement Learning

Deep Q-Network (DQN) achieved performance comparable to that of a professional human player on Atari 2600 games.
However, in large and complex domains (e.g., Ms. Pac-Man), learning can be very slow and unstable.
Hybrid Reward Architecture (HRA) aims to improve learning in such domains by decomposing the reward function
in advance and then learning a separate value function for each component reward function.
In this paper, we constructed several environments in which learning is more difficult in order to evaluate the performance of HRA.
The results indicate that HRA needs further enhancements to learn in environments where learning is difficult under a uniform random policy.
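
The reward decomposition described in the abstract can be illustrated with a minimal tabular sketch. This is not the authors' implementation: the class name `TabularHRA`, the use of one Q-table per reward component, the plain Q-learning update for each component, and aggregation by summation are all assumptions made for illustration, and the experiments in the paper may use different networks, targets, and policies.

```python
import random


class TabularHRA:
    """Hypothetical sketch: one Q-table ("head") per reward component."""

    def __init__(self, n_states, n_actions, n_heads, alpha=0.1, gamma=0.9):
        self.n_actions = n_actions
        self.alpha = alpha  # learning rate (assumed value)
        self.gamma = gamma  # discount factor (assumed value)
        # q[k][s][a] is the value estimate of head k for state s and action a.
        self.q = [
            [[0.0] * n_actions for _ in range(n_states)]
            for _ in range(n_heads)
        ]

    def aggregate(self, state):
        """Sum the per-head Q-values for each action in the given state."""
        return [
            sum(head[state][a] for head in self.q)
            for a in range(self.n_actions)
        ]

    def act(self, state, epsilon=0.1, rng=random):
        """Epsilon-greedy action selection on the aggregated Q-values."""
        if rng.random() < epsilon:
            return rng.randrange(self.n_actions)
        values = self.aggregate(state)
        return max(range(self.n_actions), key=values.__getitem__)

    def update(self, state, action, rewards, next_state, done):
        """Update each head using only its own component reward."""
        for head, r in zip(self.q, rewards):
            target = r if done else r + self.gamma * max(head[next_state])
            head[state][action] += self.alpha * (target - head[state][action])


# Usage: two reward components (e.g. one per collectible item type).
agent = TabularHRA(n_states=2, n_actions=2, n_heads=2, alpha=0.5)
agent.update(state=0, action=1, rewards=[1.0, 0.0], next_state=1, done=True)
greedy_action = agent.act(0, epsilon=0.0)
```

Each head sees only its own component of the reward, so it faces a simpler learning problem than a single value function trained on the total reward; the agent acts on the sum of the heads.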