Hybrid Learning Using Profit Sharing and Genetic Algorithm -Task Division Performance in MDP Environments-

Kohei Suzuki

6:20 PM - 6:40 PM

[1N3-04] Hybrid Learning Using Profit Sharing and Genetic Algorithm -Task Division Performance in MDP Environments-

〇Kohei Suzuki¹, Shohei Kato¹,² (1. Nagoya Institute of Technology, 2. Frontier Research Institute for Information Science, Nagoya Institute of Technology)

Keywords: POMDP, Reinforcement Learning, Genetic Algorithm

Reinforcement learning is generally formulated on Markov decision processes (MDPs). However, an agent may be unable to observe the environment correctly because of the limited perception ability of its sensors. Such a setting is called a partially observable Markov decision process (POMDP). In a POMDP environment, an agent may receive the same observation in more than one state. We previously proposed a hybrid learning method using Profit Sharing and a genetic algorithm (HPG) for this problem. However, most real problems can be represented as MDP environments. In this paper, we improve HPG to adapt it to MDP environments and report the effectiveness of our method through experiments with mazes.

Presentation information

[1N3] [General Session] 2. Machine Learning
