Challenges of Deep Reinforcement Learning in the NetHack Learning Environment: Separating the Effects of Randomness and Episode Length

Hiroshi Kiyota

[3Win5-21] Challenges of Deep Reinforcement Learning in the NetHack Learning Environment: Separating the Effects of Randomness and Episode Length

〇Hiroshi Kiyota¹ (1.ABEJA, Inc.)

Keywords: deep reinforcement learning

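The NetHack Learning Environment (NLE), a reinforcement learning environment, is characterized by randomness (a different dungeon is generated on every play), vast state and action spaces, and delayed rewards caused by long episodes, and it is known to be difficult to solve with deep reinforcement learning. Among these candidate factors, this paper focuses on the randomness of the environment. To evaluate its effect, we fixed the random seed during both training and evaluation in an attempt to eliminate randomness. Fixing the seed increased the speed of learning, which confirms that randomness is a factor that makes learning difficult, at least in the early stage of training. However, even with randomness eliminated, learning still progressed slowly, which suggests that factors other than randomness also affect learning.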
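The seed-fixing idea in the abstract can be illustrated with a minimal sketch. This is not the author's code and does not use the NLE API: `ToyDungeonEnv`, its `seed` and `reset` methods, and `collect_layouts` are hypothetical names invented for illustration, and the "dungeon" is just a list of random tile ids. The sketch only shows the difference between regenerating a level from fresh entropy on every episode and regenerating it from one fixed seed.

```python
import random


class ToyDungeonEnv:
    """Hypothetical stand-in for a procedurally generated environment (not NLE)."""

    def __init__(self, size=16):
        self.size = size
        self._seed = None  # None means "draw fresh entropy on every reset"

    def seed(self, seed):
        # Remember the seed so that every later reset reuses it.
        self._seed = seed

    def reset(self):
        # random.Random(None) seeds from system entropy, so the layout varies;
        # random.Random(fixed_int) always reproduces the same layout.
        rng = random.Random(self._seed)
        return tuple(rng.randrange(4) for _ in range(self.size))


def collect_layouts(seed, episodes=20):
    """Return the set of distinct layouts seen over several episodes."""
    env = ToyDungeonEnv()
    env.seed(seed)
    return {env.reset() for _ in range(episodes)}


if __name__ == "__main__":
    print("distinct layouts, fixed seed:", len(collect_layouts(seed=42)))
    print("distinct layouts, no seed:   ", len(collect_layouts(seed=None)))
```

With a fixed seed the agent faces the same level in every episode, which is the condition the abstract reports as learning faster; without one, each episode presents a new level.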


Presentation information

[3Win5] Poster session 3
