Trials Reduction Method for Reinforcement Learning in Trajectory Discovery

Yusuke Kato

3:40 PM - 4:00 PM

[2A3-02] Trials Reduction Method for Reinforcement Learning in Trajectory Discovery

〇Yusuke Kato¹,², Tomoaki Nakamura³, Takayuki Nagai³, Natsuki Yamanobe¹, Kazuyuki Nagata¹, Jun Ozawa¹ (1. Advanced Industrial Science and Technology, 2. Panasonic Corporation, 3. The University of Electro-Communications)

Keywords: Deep Reinforcement Learning, Robotics, Manipulation, Picking

In recent years, deep reinforcement learning has been widely studied as a way to realize autonomous robot motion. Deep reinforcement learning typically requires a large number of trials, often thousands or more, to reach sufficient performance. Learning in a real environment, however, often requires human assistance, which makes thousands of trials difficult to carry out. In this research, we build a learning database through efficient reinforcement learning that uses task knowledge provided in advance by people, and we achieve learning with a relatively small number of trials by performing mini-batch learning on that database. We apply the proposed method to learning a picking task in a logistics warehouse and show its usefulness by comparing the results with those of other methods.
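
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of storing trial results in a database and drawing mini batches from it, in the style of a generic experience-replay buffer. The names `Transition`, `LearningDatabase`, `add`, and `sample` are hypothetical and are not taken from the presentation; how the authors use prior task knowledge to collect trials is not modeled here.

```python
import random
from collections import deque
from typing import Deque, List, NamedTuple, Tuple


class Transition(NamedTuple):
    """One recorded trial step (hypothetical layout, assumed for illustration)."""
    state: Tuple[float, ...]
    action: int
    reward: float
    next_state: Tuple[float, ...]
    done: bool


class LearningDatabase:
    """Fixed-capacity store of transitions with uniform mini-batch sampling."""

    def __init__(self, capacity: int = 10_000, seed: int = 0) -> None:
        # Oldest entries are discarded automatically once capacity is reached.
        self._buffer: Deque[Transition] = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def add(self, transition: Transition) -> None:
        self._buffer.append(transition)

    def sample(self, batch_size: int) -> List[Transition]:
        # Sample without replacement; return fewer items if the store is small.
        k = min(batch_size, len(self._buffer))
        return self._rng.sample(list(self._buffer), k)

    def __len__(self) -> int:
        return len(self._buffer)


# Usage: record a handful of (dummy) trial steps, then draw one mini batch.
db = LearningDatabase(capacity=100)
for i in range(20):
    reward = 1.0 if i % 5 == 0 else 0.0  # dummy reward, for illustration only
    db.add(Transition((float(i),), i % 3, reward, (float(i + 1),), False))

batch = db.sample(8)
```

Because each stored transition can be drawn into many mini batches, a learner can perform more updates than the number of physical trials, which is the general reason such a database can help when real-world trials are expensive.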

Presentation information

[2A3] [General Session] 11. Robot / Real World
