JSAI2022

Presentation information

Organized Session

[2M1-OS-19a] World Models and Intelligence (1/4)

Wed. Jun 15, 2022 9:00 AM - 10:40 AM Room M (Room B-2)

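Organizers: Masahiro Suzuki (The University of Tokyo), Yusuke Iwasawa (The University of Tokyo) [on-site], Makoto Kawano (The University of Tokyo), Wataru Kumagai (The University of Tokyo), Yusuke Mori (Square Enix), Yutaka Matsuo (The University of Tokyo)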

10:00 AM - 10:20 AM

[2M1-OS-19a-04] Learning Bidirectional Translation between Description and Action with Small Paired Data

〇Minori Toyoda1, Kanata Suzuki1,2, Yoshihiko Hayashi1, Tetsuya Ogata1,3 (1. Waseda Univ., 2. Fujitsu Limited, 3. National Institute of Advanced Industrial Science and Technology)

Keywords: integration of language and action, learning from small data, representation acquisition

In this study, we achieved bidirectional translation between description and action using a small amount of paired data. The ability to generate descriptions from actions and actions from descriptions is essential for robots that collaborate with humans in daily life. To do so, robots must associate real-world objects with linguistic expressions, and machine learning approaches to this problem require large-scale paired data. However, paired datasets are costly to construct and difficult to collect. We propose a two-stage training method for bidirectional translation that does not require fully paired data. In the first stage, we pre-train separate autoencoders for description and action on a large amount of non-paired data. In the second stage, we fine-tune the entire model on the small paired dataset so that the intermediate representations of the two autoencoders are combined. We evaluated the method experimentally on a paired dataset of motion-captured actions and their descriptions. The results showed that the method performed well even when only a small amount of paired training data was available.
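
The two-stage schedule described in the abstract can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the autoencoders are tiny linear maps over fixed-length feature vectors rather than sequence models, the data are synthetic, the intermediate representations are "combined" by penalizing the squared distance between paired latent vectors, and all names (LinearAE, recon_step, finetune_step, beta) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

class LinearAE:
    """Tiny linear autoencoder (illustrative stand-in, not the paper's model)."""
    def __init__(self, dim_in, dim_z):
        self.enc = rng.normal(0.0, 0.1, (dim_in, dim_z))
        self.dec = rng.normal(0.0, 0.1, (dim_z, dim_in))

    def encode(self, x):
        return x @ self.enc

    def decode(self, z):
        return z @ self.dec

def recon_step(ae, x, lr):
    """One gradient step on the mean squared reconstruction error."""
    n = len(x)
    z = ae.encode(x)
    err = ae.decode(z) - x
    grad_dec = z.T @ err * (2 / n)
    grad_enc = x.T @ (err @ ae.dec.T) * (2 / n)
    ae.enc -= lr * grad_enc
    ae.dec -= lr * grad_dec
    return float((err ** 2).sum(axis=1).mean())

def finetune_step(ae_d, ae_a, xd, xa, lr, beta=1.0):
    """Joint step: both reconstruction losses plus an assumed binding loss
    that pulls the latent vectors of paired samples together."""
    n = len(xd)
    diff = ae_d.encode(xd) - ae_a.encode(xa)
    # Binding gradients are taken before the reconstruction updates,
    # so all gradients are evaluated at the same parameter values.
    grad_d = xd.T @ diff * (2 * beta / n)
    grad_a = -(xa.T @ diff) * (2 * beta / n)
    recon = recon_step(ae_d, xd, lr) + recon_step(ae_a, xa, lr)
    ae_d.enc -= lr * grad_d
    ae_a.enc -= lr * grad_a
    return recon + beta * float((diff ** 2).sum(axis=1).mean())

# Synthetic data (hypothetical): both modalities depend on a 2-D latent cause.
A = rng.normal(size=(2, 6))   # latent -> description features
B = rng.normal(size=(2, 4))   # latent -> action features
desc_unpaired = rng.normal(size=(500, 2)) @ A   # large, non-paired
act_unpaired = rng.normal(size=(500, 2)) @ B    # large, non-paired
shared = rng.normal(size=(20, 2))
desc_paired, act_paired = shared @ A, shared @ B  # small, paired

ae_desc, ae_act = LinearAE(6, 2), LinearAE(4, 2)

# Stage 1: pre-train each autoencoder separately on non-paired data.
for _ in range(300):
    recon_step(ae_desc, desc_unpaired, lr=0.01)
    recon_step(ae_act, act_unpaired, lr=0.01)

# Stage 2: fine-tune the whole model on the small paired set.
for _ in range(300):
    finetune_step(ae_desc, ae_act, desc_paired, act_paired, lr=0.01)

# Bidirectional translation: encode with one model, decode with the other.
pred_action = ae_act.decode(ae_desc.encode(desc_paired))
pred_desc = ae_desc.decode(ae_act.encode(act_paired))
```

In this sketch, translation in either direction is simply an encoder from one modality followed by the decoder of the other, which only works if the second stage has brought the two latent spaces into agreement. The weight beta on the binding term is a free choice here, not a value reported by the authors.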
