9:00 AM - 9:20 AM
[2P1-J-2-01] Toward Deep Satisficing Reinforcement Learning
Keywords: reinforcement learning, exploration-exploitation trade-off, intrinsic motivation
Algorithms such as DQN have been proposed to handle continuous state spaces in reinforcement learning (RL). However, DQN explores inefficiently because it relies on random exploration strategies such as epsilon-greedy. Humans are known to search and learn effectively through "satisficing" rather than optimizing. The risk-sensitive satisficing (RS) algorithm enables satisficing in RL, but it relies on visitation counts for each state, which is problematic in continuous spaces. We propose to solve this problem with pseudocount and hash + autoencoder methods that enable intrinsically motivated exploration. Through two experiments, we show that RS combined with these two methods achieves deep satisficing RL that searches and learns efficiently in continuous spaces.
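To illustrate the idea of replacing exact state-visitation counts with hash-based pseudocounts in a continuous space, here is a minimal sketch of SimHash-style counting via random projections. This is not the authors' implementation; the class name, projection scheme, and hyperparameters (`n_bits`, `seed`) are assumptions for illustration only.

```python
import numpy as np
from collections import defaultdict

class HashPseudoCount:
    """SimHash-style pseudocounting for continuous states (illustrative
    sketch; all details here are assumptions, not taken from the paper)."""

    def __init__(self, state_dim, n_bits=16, seed=0):
        rng = np.random.default_rng(seed)
        # Random projection matrix: each row defines one hash bit.
        self.A = rng.standard_normal((n_bits, state_dim))
        self.counts = defaultdict(int)

    def _hash(self, state):
        # The sign of each projection yields a binary code; nearby states
        # tend to map to the same code, so a continuous space becomes a
        # finite set of countable buckets.
        bits = (self.A @ np.asarray(state, dtype=float) > 0).astype(int)
        return tuple(bits)

    def visit(self, state):
        # Increment and return the pseudocount n(s) for this state's bucket.
        key = self._hash(state)
        self.counts[key] += 1
        return self.counts[key]

# Revisiting (a bucket containing) the same state accumulates its count,
# which an agent can turn into an intrinsic exploration bonus.
counter = HashPseudoCount(state_dim=2)
c1 = counter.visit([0.5, 0.5])   # first visit to this bucket
c2 = counter.visit([0.5, 0.5])   # second visit to the same bucket
print(c1, c2)
```

In the paper's setting, such pseudocounts could stand in for the exact visit counts RS requires; the hash + autoencoder variant would hash a learned latent representation of the state rather than the raw observation.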