JSAI2023

Presentation information

Organized Session » OS-21

[2G6-OS-21f] World Models and Intelligence

Wed. Jun 7, 2023 5:30 PM - 7:10 PM Room G (A4)

Organizers: Masahiro Suzuki, Yusuke Iwasawa, Makoto Kawano, Wataru Kumagai, Tatsuya Matsushima, Yusuke Mori, Yutaka Matsuo

6:50 PM - 7:10 PM

[2G6-OS-21f-05] Scaling Laws of Dataset Size for VideoGPT

〇Masahiro Negishi (1,6), Makoto Sato (2,6), Ryosuke Unno (1,6), Koudai Tabata (1,6), Taiju Watanabe (4,6), Junnosuke Kamohara (5,6), Taiga Kume (3,6), Ryo Okada (1,6), Yusuke Iwasawa (1), Yutaka Matsuo (1) (1. The University of Tokyo, 2. Nara Institute of Science and Technology, 3. Keio University, 4. Waseda University, 5. Tohoku University, 6. Matsuo Institute)

Keywords: World Models, Scaling Laws, Dataset Size

Over the past decade, deep learning has made significant strides across many domains by training large models with large-scale computational resources. Recent studies have shown that large Transformer models perform well on diverse generative tasks, including language modeling and image modeling. Efficient training of such large models requires vast amounts of data, and many fields are building large-scale datasets accordingly. However, despite the development of simulator environments such as CARLA and large-scale datasets such as RoboNet, how the performance of world models, which aim to capture the temporal and spatial structure of environments, scales with dataset size has yet to be studied sufficiently. This work therefore experimentally demonstrates a dataset-size scaling law for a world model, using VideoGPT and a dataset generated by the CARLA simulator. We also show that, when the number of model parameters is on the order of 10^7 or larger and the computational budget is limited, the budget should mainly be spent on scaling up dataset size.
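As a rough illustration of what such a dataset-size scaling law looks like, the sketch below fits a saturating power law L(D) = (Dc / D)^alpha + L_inf to validation losses measured at several dataset sizes D. The functional form follows standard scaling-law studies (e.g., Kaplan et al., 2020), and both the form and the numbers here are assumptions for illustration only, not the paper's actual fit or measurements.

# Illustrative sketch (not the authors' code): fit a saturating power law
# L(D) = (Dc / D)**alpha + L_inf to validation losses at several dataset
# sizes D. Hypothetical numbers stand in for VideoGPT/CARLA results.
import numpy as np
from scipy.optimize import curve_fit

def scaling_law(D, Dc, alpha, L_inf):
    # Validation loss as a function of dataset size D (e.g., number of frames).
    return (Dc / D) ** alpha + L_inf

# Hypothetical measurements: dataset sizes and corresponding validation losses.
D = np.array([1e4, 3e4, 1e5, 3e5, 1e6])
L = np.array([4.1, 3.6, 3.2, 2.9, 2.7])

# Positivity bounds keep (Dc / D)**alpha well defined during optimization.
(Dc, alpha, L_inf), _ = curve_fit(
    scaling_law, D, L, p0=(1e4, 0.3, 2.0), bounds=(0, np.inf)
)
print(f"fit: L(D) = ({Dc:.3g}/D)^{alpha:.3f} + {L_inf:.3f}")

The fitted exponent alpha summarizes how quickly loss falls as data is added, and L_inf estimates the irreducible loss; comparing such fits across model sizes is how one decides whether extra compute is better spent on data or on parameters.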
