5:40 PM - 6:00 PM
[2M6-OS-19d-02] Sequential Entity Disentanglement for Object-Centric Learning
Keywords: Representation learning, World model, Compositionality
Perceiving the world requires meaningful disentanglement both spatially and temporally, and acquiring such representations is thought to be beneficial for prediction and planning. Recent object-centric models have improved in their ability to learn a distinct latent representation for each object and to predict the objects' interactions. However, these models still fail to generalize well to unseen combinations of objects and dynamics. In this paper, we propose two new models that, for each entity in the scene, learn to disentangle a time-varying latent variable, used to predict interactions, from a time-invariant latent variable that stores static object properties. In our experiments, we show that the proposed architecture disentangles scenes compositionally and without supervision, both spatially into individual objects and temporally, conditioned on actions. We also explore its benefits for object-centric planning and for generalization to novel object configurations.