6:30 PM - 6:50 PM
[2O6-OS-16a-04] Learning Compositional Latents and Behaviors from Object-Centric Latent Imagination
Keywords: Representation learning, World models, Object-centric learning
In reinforcement learning, model-based methods are a promising approach. This approach learns a world model and, through imagination, learns complex behaviors to solve long-horizon tasks from visual inputs only. Recent transformer-based world models have improved sample efficiency on these tasks, owing to the transformer's ability to capture long-term dependencies. However, world models still struggle with compositional tasks, because predicting object interactions and accurately tracking objects remain difficult, especially in unseen configurations. Object-centric learning disentangles a scene or video into its individual objects without supervision, leading to a more compositional understanding and better generalization to unseen objects and scenes. In this paper, we propose a world model that uses object-centric latents to predict dynamics. Our model aims to combine the compositional generalization of object-centric learning with the sample efficiency and long-horizon prediction of transformer-based world models. To validate the efficacy of our approach, we conducted experiments on the OCRL benchmark dataset.