JSAI2022

Presentation information

Organized Session


[2M4-OS-19b] World Models and Intelligence (2/4)

Wed. Jun 15, 2022 1:20 PM - 2:40 PM Room M (Room B-2)

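Organizers: 鈴木 雅大 (The University of Tokyo), 岩澤 有祐 (The University of Tokyo) [on-site], 河野 慎 (The University of Tokyo), 熊谷 亘 (The University of Tokyo), 森 友亮 (Square Enix), 松尾 豊 (The University of Tokyo)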

2:00 PM - 2:20 PM

[2M4-OS-19b-03] Coordination of model-based and model-free reinforcement learning

〇Eiji Uchibe (Advanced Telecommunications Research Institute International)

Keywords: Model-based reinforcement learning, Model-free reinforcement learning

Reinforcement learning algorithms are categorized into model-based methods, which explicitly estimate an environmental model and a reward function, and model-free methods, which learn a policy directly from real or generated experiences. We previously proposed an asynchronous parallel reinforcement learning algorithm for training multiple model-free and model-based reinforcement learners. The experimental results showed that a simple algorithm can contribute to the learning of more complex algorithms. However, a learner was selected stochastically according to the value function, so learning mechanisms have not been discussed. In addition, several components, such as state prediction errors and value prediction errors, were not taken into account. In this study, we compare several adaptive coordination mechanisms. For example, we evaluate coordination based on the value functions, coordination based on state prediction and value prediction errors, weighted coordination, and coordination in which the weights are learned. We then discuss learning efficiency, the ability to follow changes in the environment, and the perspective of neuroscience.
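
The abstract does not give the selection rule or the form of the weighted coordination, so the following Python sketch is only an illustration under stated assumptions. It assumes a softmax (Boltzmann) rule for stochastic learner selection and a linear score that rewards a high value estimate and penalizes state prediction and value prediction errors. The function names, the temperature parameter, and the weight layout are hypothetical and are not taken from the paper.

```python
import math
import random


def softmax(values, temperature=1.0):
    """Convert raw scores into selection probabilities (numerically stable)."""
    highest = max(values)
    exps = [math.exp((v - highest) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def select_learner(scores, temperature=1.0, rng=random):
    """Stochastically pick one learner index; higher scores are more likely.

    Assumption: a softmax rule. The abstract only says that selection is
    stochastic and depends on the value function.
    """
    probs = softmax(scores, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def weighted_scores(values, state_pred_errors, value_pred_errors, weights):
    """Combine the three signals into one score per learner.

    Assumption: a linear combination in which a larger value estimate raises
    the score and larger prediction errors lower it. `weights` is a
    hypothetical (w_value, w_state, w_td) triple that could be fixed or learned.
    """
    w_value, w_state, w_td = weights
    return [
        w_value * v - w_state * abs(es) - w_td * abs(et)
        for v, es, et in zip(values, state_pred_errors, value_pred_errors)
    ]


if __name__ == "__main__":
    # Two hypothetical learners, e.g. one model-free and one model-based.
    scores = weighted_scores(
        values=[1.0, 2.0],
        state_pred_errors=[0.5, 0.0],
        value_pred_errors=[0.0, 1.0],
        weights=(1.0, 1.0, 1.0),
    )
    chosen = select_learner(scores, temperature=0.5)
    print("scores:", scores, "selected learner:", chosen)
```

Setting the state and value error weights to zero reduces the score to the value estimate alone, which corresponds to selection according to the value function only. Lowering the temperature makes the selection closer to greedy.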
