Object-Centric Transformer World Models and Causality-aware Policy

Yosuke Nishimoto

3:40 PM - 4:00 PM

[1B4-OS-41b-01] Object-Centric Transformer World Models and Causality-aware Policy

〇Yosuke Nishimoto¹, Takashi Matsubara² (1. Graduate School of Engineering Science, Osaka University, 2. Graduate School of Information Science and Technology, Hokkaido University)

Keywords: world models, object-centric representation learning, reinforcement learning

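Deep reinforcement learning agents are extremely sample-inefficient, which makes them difficult to apply in the real world. To address this problem, many model-based reinforcement learning methods that learn inside the "imagination" generated by a world model have been proposed, with some success. However, it remains difficult for a reinforcement learning agent to acquire a world model in environments where multiple objects exist and interact with one another. In this study, we propose TISA+. It is a modification of Transformer-based Imagination with Slot Attention (TISA), a reinforcement learning agent in which the world model, the policy function, and the value function are all Transformers that operate on object-centric representations. The world model processes the state of each object (excluding the background), the action, and the reward individually, which lets it make high-dimensional predictions effectively and avoid combinatorial explosion.
The policy and value functions predict causal relationships between tokens based on the properties of objects, enabling decision making that captures those properties more accurately.
On the PointButton1 task of the Safety-Gym benchmark, TISA+ outperformed existing methods.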
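
As a rough illustration of the token layout described in the abstract, the following Python sketch builds one token per object plus an action token and a reward token, then applies a single attention step whose weights are scaled by a pairwise gate. Everything here is hypothetical: the function names `build_tokens` and `gated_attention`, the shared token width, and the gate itself (a stand-in for the predicted causal relationships between tokens) are assumptions for illustration, not the TISA+ implementation, whose details the abstract does not give.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def build_tokens(slots, action, reward):
    """Return one token per object slot, followed by an action token and a
    reward token. All tokens share one width (an assumption of this sketch)."""
    return [list(s) for s in slots] + [list(action), list(reward)]


def gated_attention(tokens, gate):
    """Single-head self-attention without learned projections.

    gate[i][j] in [0, 1] scales how much token i attends to token j.
    It is a hypothetical stand-in for a predicted causal relationship;
    each row of the gate must keep at least one nonzero entry.
    """
    d = len(tokens[0])
    out = []
    for i, query in enumerate(tokens):
        scores = [dot(query, key) / math.sqrt(d) for key in tokens]
        weights = softmax(scores)
        # Scale the attention weights by the gate, then renormalize.
        gated = [w * gate[i][j] for j, w in enumerate(weights)]
        norm = sum(gated)
        gated = [g / norm for g in gated]
        out.append([
            sum(gated[j] * tokens[j][c] for j in range(len(tokens)))
            for c in range(d)
        ])
    return out


# Two object slots, one action token, one reward token (toy values).
tokens = build_tokens(
    slots=[[1.0, 0.0], [0.0, 1.0]],
    action=[0.5, 0.5],
    reward=[0.0, 0.0],
)
n = len(tokens)
full_gate = [[1.0] * n for _ in range(n)]  # every token may attend to every token
mixed = gated_attention(tokens, full_gate)
```

With a gate of all ones this reduces to ordinary softmax attention. Setting an entry of the gate to zero removes that token pair from the mixture, which is one simple way a predicted relationship could restrict which objects influence a decision.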

Presentation information

[1B4-OS-41b] OS-41