9:00 AM - 10:40 AM
[4Rin1-10] A Sub-policy Pruning Method for Meta Learning Shared Hierarchies
Keywords:Reinforcement Learning, Hierarchical Reinforcement Learning
Hierarchical reinforcement learning is a subfield of reinforcement learning.
By reusing knowledge to solve a distribution of tasks, it can quickly reach high reward.
Hierarchical reinforcement learning can solve sparse-reward problems: it divides the policy into several sub-policies,
assuming that the goal can likewise be divided into sub-goals,
and expects each sub-policy to fit one sub-goal.
One drawback of this approach is that, in general, it is not possible to know the
number of sub-goals in an unseen task, and hence the number of sub-policies. Without a
proper number of sub-policies, hierarchical reinforcement learning cannot be expected
to work well.
To solve this problem, we propose a method that finds the proper number of
sub-policies by pruning excessive sub-policies.
The proposed method saves resources and reduces the time the algorithm takes to converge.
We test the method on a 2D-bandit problem and demonstrate its effectiveness.
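The pruning idea can be sketched minimally as follows. This is an illustrative assumption, not the authors' exact criterion: here a sub-policy is pruned when the master policy's usage share for it falls below a threshold, with `selection_counts`, `min_share`, and the threshold value all hypothetical names and choices.

```python
def prune_subpolicies(selection_counts, min_share=0.1):
    """Drop sub-policies whose selection share falls below min_share.

    selection_counts: dict mapping sub-policy id -> number of times the
    master policy selected that sub-policy over a window of tasks
    (hypothetical usage statistics).
    Returns the sorted ids of the sub-policies that survive pruning.
    """
    total = sum(selection_counts.values())
    if total == 0:
        # No usage data yet: keep everything rather than prune blindly.
        return sorted(selection_counts)
    return sorted(
        sp for sp, count in selection_counts.items()
        if count / total >= min_share
    )

# Toy usage: start with 4 sub-policies, but only 2 are ever selected often.
counts = {0: 480, 1: 505, 2: 10, 3: 5}
print(prune_subpolicies(counts))  # only the two dominant sub-policies remain
```

Under this sketch, an algorithm initialized with an over-estimated number of sub-policies would periodically discard the rarely used ones, which is consistent with the resource and convergence-time savings claimed above.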