JSAI2024

Presentation information

General Session » GS-2 Machine learning

[2B5-GS-2] Machine learning: Reinforcement learning

Wed. May 29, 2024 3:30 PM - 5:10 PM Room B (Concert hall)

Chair: Tadahiro Taniguchi (Kyoto University)

4:10 PM - 4:30 PM

[2B5-GS-2-03] Transformer-based World Models with Object-Centric Representations

〇Yosuke Nishimoto1, Takashi Matsubara1 (1. Osaka University)

Keywords: World Models, Reinforcement Learning, Object-Centric Representations

World models mimic observed dynamics to aid the learning of complex behaviors. However, in settings such as game playing, where objects with distinct dynamics coexist on the same screen, learning an effective world model is challenging. This challenge has also been identified in tasks such as video prediction, and recent work has addressed it using object-centric representations. In this paper, we present transformer-based world models with object-centric representations, combining world models with a video-prediction method built on object-centric representations. The approach uses object-level features to model spatiotemporal relationships and to predict future states accurately conditioned on actions. The transformer receives multiple latent states from the object-centric representations, together with rewards and actions, and flexibly attends to all modalities across time steps. It is therefore expected to distinguish the distinct dynamics of each object and to predict accurate future states in response to actions. We validated the effectiveness of our method on Boxing from the Atari 100k benchmark, demonstrating its utility.
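The sketch below is not taken from the paper; it is a minimal illustration, assuming a PyTorch implementation, of how per-object slot latents, actions, and rewards could be arranged as one token sequence for a transformer world model of the kind the abstract describes. All names, dimensions, and the tokenization layout (ObjectCentricWorldModel, slot_dim, one action token and one reward token per time step) are assumptions; causal attention masking and the image decoder are omitted for brevity.

```python
# Minimal sketch (illustrative assumptions only, not the authors' code):
# object-centric slot latents, a discrete action, and a scalar reward per
# time step are projected to a shared embedding space, flattened into one
# token sequence, and processed by a transformer encoder that predicts the
# next-step slot latents and the reward.
import torch
import torch.nn as nn


class ObjectCentricWorldModel(nn.Module):
    def __init__(self, num_slots=8, slot_dim=64, d_model=128,
                 num_actions=18, num_layers=4, num_heads=4):
        super().__init__()
        # Project each per-object slot latent (e.g. from a slot-based encoder)
        # into the transformer's embedding space.
        self.slot_proj = nn.Linear(slot_dim, d_model)
        # One token per step for the discrete action and the scalar reward.
        self.action_emb = nn.Embedding(num_actions, d_model)
        self.reward_proj = nn.Linear(1, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=num_heads, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=num_layers)
        # Heads that read predictions off the slot positions.
        self.next_slot_head = nn.Linear(d_model, slot_dim)
        self.reward_head = nn.Linear(d_model, 1)

    def forward(self, slots, actions, rewards):
        # slots:   (B, T, num_slots, slot_dim) object-centric latents per step
        # actions: (B, T) discrete action indices
        # rewards: (B, T) scalar rewards
        B, T, S, _ = slots.shape
        slot_tok = self.slot_proj(slots)                          # (B, T, S, d)
        act_tok = self.action_emb(actions).unsqueeze(2)           # (B, T, 1, d)
        rew_tok = self.reward_proj(rewards.unsqueeze(-1)).unsqueeze(2)
        # Per step: [slot_1 .. slot_S, action, reward]; flatten over time so
        # attention can relate objects across modalities and time steps.
        # (A real world model would also apply a causal attention mask.)
        tokens = torch.cat([slot_tok, act_tok, rew_tok], dim=2)   # (B, T, S+2, d)
        tokens = tokens.reshape(B, T * (S + 2), -1)
        h = self.transformer(tokens).reshape(B, T, S + 2, -1)
        next_slots = self.next_slot_head(h[:, :, :S])             # (B, T, S, slot_dim)
        pred_reward = self.reward_head(h[:, :, S])                # (B, T, 1)
        return next_slots, pred_reward


# Example usage with dummy tensors.
model = ObjectCentricWorldModel()
slots = torch.randn(2, 10, 8, 64)                 # batch of 2, 10 steps, 8 objects
actions = torch.randint(0, 18, (2, 10))           # Atari-style discrete actions
rewards = torch.randn(2, 10)
next_slots, pred_reward = model(slots, actions, rewards)
```

Keeping one token per object (rather than one token per frame) is what lets the transformer attend to each object's dynamics separately, which is the property the abstract attributes to the proposed method.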
