JSAI2021

Presentation information

General Session

[2J4-GS-8c] Robots and the Real World: Learning

Wed. Jun 9, 2021 3:20 PM - 5:00 PM Room J (GS room 5)

Chair: 有木 由香 (Sony Corporation)

4:40 PM - 5:00 PM

[2J4-GS-8c-05] Learning World Models using Skill Based Exploration Policy

〇Naruya Kondo1, Yusuke Iwasawa1, Yutaka Matsuo1 (1. The University of Tokyo)

Keywords: world model, skill discovery

Prior work has shown the power of world models, which model how the world evolves over time and space. The key property of a world model is that it is learned from data. However, few studies discuss how to collect good data for learning good world models; most prior work uses either a purely random policy or an expert policy to collect the data.
The former may not effectively cover the part of the world that is of interest, and expert data is cumbersome to collect.
To this end, this paper investigates the potential of leveraging the concept of "skill" to collect good data for learning world models.
Our method trains world models on data collected by an exploration policy based on a skill embedding, which is itself learned from data simulated with the current world models.
Because the skills are learned in a fully unsupervised manner, our method does not rely on any expert data, yet it can explore the world more widely than a random policy.
In other words, the skill embedding is learned without supervision from data simulated with the current world models, and the learned skills are then used to collect the data for training the world models.
Empirical results on the MuJoCo simulator show that our method can acquire better world models with less data than a random policy.
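
The alternation described in the abstract (learn skills from data simulated in the current world model, then use those skills to collect data for retraining the model) might be sketched roughly as follows. This is only an illustrative toy under strong assumptions, not the authors' implementation: the 2-D linear environment `env_step`, the least-squares world model, and the skill-selection rule (choosing constant-action skills whose imagined end states are most spread out, as a simple stand-in for an unsupervised skill discovery objective) are all hypothetical, as are every function name and constant.

```python
import numpy as np

rng = np.random.default_rng(0)
STATE_DIM, ACTION_DIM, HORIZON, NUM_SKILLS = 2, 2, 10, 4  # hypothetical sizes

def env_step(state, action):
    """Hypothetical toy dynamics standing in for a MuJoCo task."""
    return state + 0.1 * action

def fit_world_model(transitions):
    """Least-squares linear model: next_state ~ [state, action] @ weights."""
    inputs = np.array([np.concatenate([s, a]) for s, a, _ in transitions])
    targets = np.array([s_next for _, _, s_next in transitions])
    weights, *_ = np.linalg.lstsq(inputs, targets, rcond=None)
    return weights

def model_step(weights, state, action):
    """One-step prediction of the learned world model."""
    return np.concatenate([state, action]) @ weights

def rollout(step_fn, skill_action):
    """Run one skill (a constant action here) for HORIZON steps from the origin."""
    state, transitions = np.zeros(STATE_DIM), []
    for _ in range(HORIZON):
        next_state = step_fn(state, skill_action)
        transitions.append((state, skill_action, next_state))
        state = next_state
    return transitions

def learn_skills(weights, num_candidates=50):
    """Assumed stand-in for skill discovery: among random candidate skill sets,
    keep the one whose imagined end states are farthest apart."""
    best_skills, best_score = None, -np.inf
    for _ in range(num_candidates):
        skills = rng.uniform(-1.0, 1.0, size=(NUM_SKILLS, ACTION_DIM))
        # Rollouts use the learned model only, never the environment.
        ends = np.array([
            rollout(lambda s, a: model_step(weights, s, a), z)[-1][2]
            for z in skills
        ])
        dists = np.linalg.norm(ends[:, None] - ends[None, :], axis=-1)
        score = dists[np.triu_indices(NUM_SKILLS, k=1)].min()
        if score > best_score:
            best_skills, best_score = skills, score
    return best_skills

def train(num_iterations=3):
    """Alternate between skill learning in imagination and data collection."""
    # Bootstrap the model with one short random-action episode.
    data, state = [], np.zeros(STATE_DIM)
    for _ in range(HORIZON):
        action = rng.uniform(-1.0, 1.0, size=ACTION_DIM)
        next_state = env_step(state, action)
        data.append((state, action, next_state))
        state = next_state
    weights = fit_world_model(data)
    for _ in range(num_iterations):
        skills = learn_skills(weights)       # skills come from simulated data
        for z in skills:                     # explore the environment per skill
            data.extend(rollout(env_step, z))
        weights = fit_world_model(data)      # refit on all collected data
    return weights, skills
```

Calling `train()` returns the fitted model weights and the last set of skills. Because the toy dynamics are linear, the model is easy to fit here; the sketch is meant to show the structure of the loop, not to reproduce the reported comparison against a random policy.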
