Presentation information

Oral presentation

[1N3] [General Session] 2. Machine Learning

Tue. Jun 5, 2018 5:20 PM - 7:00 PM Room N (2F Sakurajima)

Chair: Tohgoroh Matsui (Chubu University)

5:20 PM - 5:40 PM

[1N3-01] Excluding the Data with Exploration from Supervised Learning Improves Neural Fictitious Self-Play

〇Keigo Kawamura (1), Jun Suzuki (2, 3), Yoshimasa Tsuruoka (4) (1. Graduate School of Engineering, The University of Tokyo, 2. NTT Communication Science Laboratories, NTT Corporation, 3. RIKEN Center for Advanced Intelligence Project, 4. Graduate School of Information Science and Technology, The University of Tokyo)

Keywords: Imperfect information games, Reinforcement learning, Self-play, Nash equilibria

Neural fictitious self-play (NFSP) is a method for solving imperfect information games.
While methods developed in recent years, such as counterfactual regret minimization and DeepStack, require the state transition rules of the game, NFSP works without them.
In this paper, we propose excluding exploration data from the supervised learning component of NFSP while retaining the exploration probability, so that the agent can explore without corrupting the average strategy.
We show that this change significantly improves the performance of NFSP in a simplified poker game, Leduc Hold'em, and we compare the results for different exploration probabilities.
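
As an illustration only, the proposed change can be sketched as a rule for routing experience data: every step goes to the reinforcement learning memory, but only non-exploratory steps go to the supervised learning memory from which the average strategy is learned. The abstract does not give implementation details, so everything below is an assumption made for the sketch: the epsilon-greedy exploration scheme, the names `select_action` and `record_step`, the plain Python lists standing in for the two memories, and the choice to count a random draw as exploration even when it happens to match the greedy action.

```python
import random


def select_action(greedy_action, num_actions, epsilon, rng):
    """Epsilon-greedy choice (assumed scheme). Returns (action, explored)."""
    if rng.random() < epsilon:
        # Exploration: pick uniformly at random and flag the step.
        return rng.randrange(num_actions), True
    return greedy_action, False


def record_step(state, greedy_action, num_actions, epsilon, rng,
                rl_memory, sl_memory):
    """Store one step, keeping exploratory actions out of the SL memory."""
    action, explored = select_action(greedy_action, num_actions, epsilon, rng)
    # The reinforcement learning memory receives every step, so the agent
    # still learns from what exploration reveals. (Rewards and next states
    # are omitted here for brevity.)
    rl_memory.append((state, action))
    # The supervised learning memory, used to fit the average strategy,
    # receives only steps where the greedy action was actually played.
    if not explored:
        sl_memory.append((state, action))
    return action, explored


if __name__ == "__main__":
    rng = random.Random(0)
    rl_memory, sl_memory = [], []
    for state in range(1000):
        record_step(state, greedy_action=2, num_actions=3, epsilon=0.1,
                    rng=rng, rl_memory=rl_memory, sl_memory=sl_memory)
    print(len(rl_memory), len(sl_memory))
```

In this sketch the exploration probability `epsilon` stays in effect for action selection, while the supervised learning memory contains only greedy actions, which is one way to read "explore without corrupting the average strategy".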