Developing a scalable and simple verification task of deep reinforcement learning

Ryuji Ikeda

3:40 PM - 4:00 PM

[2C5-GS-2-02] Developing a scalable and simple verification task of deep reinforcement learning

〇Ryuji Ikeda¹, Akane Minami¹, Yu Kono², Tatsuji Takahashi² (1. Graduate School of Tokyo Denki University, 2. School of Science and Engineering, Tokyo Denki University)

Keywords: Deep Reinforcement Learning, Machine Learning, Reinforcement Learning

AlphaGo, a Go program developed by DeepMind in 2015, defeated one of the strongest professional human Go players. Deep reinforcement learning, on which AlphaGo draws, has received a great deal of attention in recent years and is expected to be useful in many settings, such as machine control and learning to play digital games. Deep reinforcement learning can handle a larger state-action space than conventional reinforcement learning, but the accompanying increase in required computational resources and learning time has become a problem. This problem is prominent even in basic research on deep reinforcement learning, where experiments currently cannot be run in a simple, routine way. To address this problem, we propose a new, simple task for deep reinforcement learning. In the proposed "Simple task", rows of states are arranged in a pyramid shape on hyperplanes in multiple layers, and the agent reaches the goal in a constant number of steps by moving one way between planes, through a space without loops. This property makes it possible to learn in a short time and makes analysis easy. The difficulty of the "Simple task" can be adjusted scalably, which facilitates basic research; in addition, its improved extensibility makes it applicable to a variety of conventional research topics.
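The abstract does not give the exact layout of the "Simple task", so the following is only a minimal sketch of one possible reading, not the authors' implementation. It assumes that layer k holds k + 1 states when there are two actions, that the agent starts at the apex, that every action moves it one layer down, and that reward is given only in the bottom layer. The names `PyramidTask`, `depth`, `n_actions`, `terminal_rewards`, and `run_episode` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PyramidTask:
    """Hypothetical layered, loop-free task (an assumed reading of the abstract)."""
    depth: int = 4        # transitions per episode; assumed difficulty knob
    n_actions: int = 2    # action a moves position p to p + a in the next layer
    terminal_rewards: list = field(default_factory=list)  # one per bottom state
    layer: int = 0
    pos: int = 0

    def width(self, k):
        # Number of states in layer k; grows with k, giving the pyramid shape.
        return k * (self.n_actions - 1) + 1

    def __post_init__(self):
        if not self.terminal_rewards:
            # Assumed reward layout: only the right-most bottom state pays 1.
            self.terminal_rewards = [0.0] * (self.width(self.depth) - 1) + [1.0]
        assert len(self.terminal_rewards) == self.width(self.depth)

    def reset(self):
        self.layer, self.pos = 0, 0
        return (self.layer, self.pos)

    def step(self, action):
        assert 0 <= action < self.n_actions
        # Transitions only go to the next layer, so no state is visited twice.
        self.layer += 1
        self.pos += action
        done = self.layer == self.depth
        reward = self.terminal_rewards[self.pos] if done else 0.0
        return (self.layer, self.pos), reward, done

    def n_states(self):
        return sum(self.width(k) for k in range(self.depth + 1))

    def optimal_return(self):
        # Backward induction over layers; exact because the graph has no loops.
        values = list(self.terminal_rewards)
        for k in range(self.depth - 1, -1, -1):
            values = [max(values[p + a] for a in range(self.n_actions))
                      for p in range(self.width(k))]
        return values[0]


def run_episode(env, policy):
    """Roll out one episode and return (total reward, number of steps)."""
    state = env.reset()
    total, steps, done = 0.0, 0, False
    while not done:
        state, reward, done = env.step(policy(state))
        total += reward
        steps += 1
    return total, steps
```

Under these assumptions every episode lasts exactly `depth` steps regardless of the policy, and raising `depth` or `n_actions` enlarges the state space, which is one way the scalable difficulty adjustment described above could be realized. Because the state graph has no loops, the optimal return can be computed exactly and compared against what a learning agent achieves.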

Presentation information

[2C5-GS-2] Machine learning: reinforcement learning (2)
