[3Yin2-10] Efficient Deep Reinforcement Learning in Large-Scale Environments Using Exploration Criteria by Critic-Attention
Keywords: Reinforcement Learning, Large-Scale Environments
Deep reinforcement learning is a method in which an agent learns optimal behavior through trial and error in an unknown environment, guided by the rewards it obtains, and it has surpassed human performance on various game tasks such as Atari 2600 and board games. However, the agent acts randomly, without any exploration criterion, until it first obtains a reward. In large and complex environments where opportunities to obtain rewards are scarce, a huge number of trials is therefore required to learn appropriate actions. In this paper, we pre-train a Critic model with a Mask-Attention mechanism and use the resulting attention map as an exploration criterion for the Policy model, enabling efficient learning. Experiments in Minecraft show that the proposed method learns actions efficiently.
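The abstract does not spell out how the critic's attention map is turned into an exploration criterion, so the following is only a minimal sketch of one plausible reading, assuming image observations and a PyTorch actor-critic setup. The class and function names (`MaskAttentionCritic`, `exploration_bonus`) and the specific bonus formula are hypothetical and are not taken from the paper.

```python
# Hypothetical sketch: a critic with a spatial mask-attention head whose
# attention map is reused as an exploration signal for the policy.
import torch
import torch.nn as nn

class MaskAttentionCritic(nn.Module):
    """Critic whose value estimate is computed from attention-masked features."""
    def __init__(self, in_channels: int = 3):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
        )
        # 1x1 convolution producing a spatial mask in [0, 1] (the "mask-attention").
        self.attention = nn.Conv2d(64, 1, kernel_size=1)
        self.value_head = nn.Sequential(
            nn.Flatten(), nn.LazyLinear(256), nn.ReLU(), nn.LazyLinear(1)
        )

    def forward(self, obs: torch.Tensor):
        feat = self.encoder(obs)                    # (B, 64, H', W') feature map
        attn = torch.sigmoid(self.attention(feat))  # (B, 1, H', W') attention map
        value = self.value_head(feat * attn)        # value from masked features
        return value, attn

def exploration_bonus(attn: torch.Tensor, coef: float = 0.01) -> torch.Tensor:
    """Hypothetical intrinsic bonus derived from the pre-trained critic's attention.

    Here it is simply the mean attention mass per observation; the actual
    criterion used in the paper may be defined differently.
    """
    return coef * attn.mean(dim=(1, 2, 3))

# Example usage: during policy training, the bonus could be added to the
# environment reward so exploration is biased toward observations the
# pre-trained critic attends to.
critic = MaskAttentionCritic()
obs = torch.zeros(1, 3, 84, 84)
value, attn = critic(obs)
shaped_reward = 0.0 + exploration_bonus(attn)
```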