JSAI2023

Presentation information

General Session

[2D4-GS-2] Machine learning

Wed. Jun 7, 2023 1:30 PM - 3:10 PM Room D (A1)

Chair: Shinichi Shirakawa (Yokohama National University) [On-site]

1:50 PM - 2:10 PM

[2D4-GS-2-02] Using Search Results in Self-play Deep Reinforcement Learning

〇Kazuya Kagoshima¹, Itsuki Noda¹, Satoshi Oyama¹ (1. Hokkaido University)

Keywords: Reinforcement Learning, Deep Learning, Self-play

We propose a new method for generating training data in self-play deep reinforcement learning, which is widely used in game AI such as AlphaGo Zero and AlphaZero. In general, such self-play learning discards most of the search results generated during self-play, and few studies have attempted to make use of them. The proposed method converts these search results into training data by estimating the final win/loss reward and the policy for each of them. Experiments with various training hyperparameters suggest that the proposed method helps the policy to be learned effectively and stabilizes training.
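The abstract does not say how the reward and policy are estimated, so the following is only an illustrative sketch of the general idea of turning search results into training data, not the authors' method. It assumes an AlphaZero-style search tree in which each node stores a visit count and a sum of backed-up values. The `Node` class, its field names, the `harvest_examples` function, and the `min_visits` threshold are all hypothetical. The policy target is taken from normalized child visit counts and the value target from the node's mean backed-up value.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    """Hypothetical search-tree node; all field names are illustrative assumptions."""
    state: str
    visit_count: int = 0
    value_sum: float = 0.0  # sum of values backed up through this node
    children: Dict[str, "Node"] = field(default_factory=dict)  # action -> child


Example = Tuple[str, Dict[str, float], float]


def harvest_examples(root: Node, min_visits: int = 10) -> List[Example]:
    """Collect (state, policy_target, value_target) from every node that was
    visited at least min_visits times and has at least one visited child."""
    examples: List[Example] = []
    stack = [root]
    while stack:
        node = stack.pop()
        child_visits = sum(c.visit_count for c in node.children.values())
        if node.visit_count >= min_visits and child_visits > 0:
            # Assumed policy estimate: normalized child visit counts.
            policy = {a: c.visit_count / child_visits
                      for a, c in node.children.items()}
            # Assumed reward estimate: mean backed-up value at this node.
            value = node.value_sum / node.visit_count
            examples.append((node.state, policy, value))
        stack.extend(node.children.values())
    return examples


# Tiny hand-built tree standing in for the result of one search.
root = Node("s0", visit_count=20, value_sum=8.0)
root.children["a"] = Node("s1", visit_count=15, value_sum=-6.0)
root.children["b"] = Node("s2", visit_count=5, value_sum=-1.0)
root.children["a"].children["c"] = Node("s3", visit_count=12, value_sum=6.0)
root.children["a"].children["d"] = Node("s4", visit_count=2, value_sum=0.0)

examples = harvest_examples(root)
for state, policy, value in examples:
    print(state, policy, value)
```

In this sketch, standard self-play would keep only the root position `s0`, whereas the harvesting step also keeps the well-explored interior position `s1`. The visit threshold is one assumed way to filter out positions whose estimates are too noisy to train on.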
