9:00 AM - 10:40 AM
[4Rin1-10] A Sub-policy Pruning Method for Meta Learning Shared Hierarchies
Keywords:Reinforcement Learning, Hierarchical Reinforcement Learning
Hierarchical reinforcement learning is a subfield of reinforcement learning.
By reusing knowledge to solve a distribution of tasks, it can quickly reach high reward.
Hierarchical reinforcement learning can solve sparse-reward problems: it divides the policy into several sub-policies,
assuming that the goal can likewise be divided into sub-goals,
and expects each sub-policy to fit one sub-goal.
One drawback of this approach is that, in general, it is not possible to know the
number of sub-goals in an unseen task, and hence the number of sub-policies. Without a
proper number of sub-policies, hierarchical reinforcement learning cannot be expected
to work well.
To solve this problem, we propose a method that finds the proper number of
sub-policies by pruning excessive sub-policies.
The proposed method saves resources and reduces the time the algorithm takes to converge.
We test the method on a 2D-bandit problem and demonstrate its effectiveness.
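The pruning idea can be sketched minimally as follows. This is an illustrative assumption, not the authors' exact criterion: here a sub-policy is pruned when the master policy's usage share for it falls below a threshold, with `selection_counts`, `min_share`, and the threshold value all hypothetical names and choices.

```python
def prune_subpolicies(selection_counts, min_share=0.1):
    """Drop sub-policies whose selection share falls below min_share.

    selection_counts: dict mapping sub-policy id -> number of times the
    master policy selected that sub-policy over a window of tasks
    (hypothetical usage statistics).
    Returns the sorted ids of the sub-policies that survive pruning.
    """
    total = sum(selection_counts.values())
    if total == 0:
        # No usage data yet: keep everything rather than prune blindly.
        return sorted(selection_counts)
    return sorted(
        sp for sp, count in selection_counts.items()
        if count / total >= min_share
    )

# Toy usage: start with 4 sub-policies, but only 2 are ever selected often.
counts = {0: 480, 1: 505, 2: 10, 3: 5}
print(prune_subpolicies(counts))  # only the two dominant sub-policies remain
```

Under this sketch, an algorithm initialized with an over-estimated number of sub-policies would periodically discard the rarely used ones, which is consistent with the resource and convergence-time savings claimed above.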