Presentation information

General Session

[4O3-J-7] Agents: learning agents

Fri. Jun 7, 2019 2:00 PM - 3:20 PM Room O (Front-left room of 1F Exhibition hall)

Chair: Naoki Fukuda; Reviewer: Yoshimasa Tawatsuji

2:00 PM - 2:20 PM

[4O3-J-7-01] Efficient Learning of Othello Utilizing the Concept of "Undo"

〇Minori Narita¹, Daiki Kimura² (1. The University of Tokyo, 2. IBM Research AI)

Keywords: Game AI, Monte Carlo Tree Search, Deep Reinforcement Learning

The combination of Monte Carlo Tree Search (MCTS) and deep reinforcement learning, exemplified by methods such as AlphaZero, has achieved remarkable performance, but it requires substantial computational resources and long training times. In this study, we propose a novel MCTS-based algorithm that introduces a "failure rate" to make exploration more efficient and thereby shorten training time. The algorithm makes the agent prioritize exploring the states that are important to winning. Our method outperformed AlphaZero in the first few iterations.