JSAI2024

Presentation information

Organized Session

[4P3-OS-17c] OS-17

Fri. May 31, 2024 2:00 PM - 3:20 PM Room P (Room 401)

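Organizers: Naoki Natori (Aisin Corporation), Daisuke Kaji (DENSO Corporation), Masaaki Hirose (DENSO Corporation), Yoshiumi Kawamura (Toyota Motor Corporation), Hirotaka Kaji (Toyota Motor Corporation), Kiyosumi Kidono (Toyota Central R&D Labs., Inc.)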

2:00 PM - 2:20 PM

[4P3-OS-17c-01] Topological Map Composed of Text Information

〇Hideki Deguchi¹, Shun Taguchi¹ (1. Toyota Central R&D Labs., Inc.)

Keywords: Vision-and-language navigation, Mapping, Large language models

In recent years, research on vision-and-language navigation has made significant progress, although it typically requires costly user instructions for each navigation step. To address this problem, we explored a method that creates a map from the user's language path instructions. This study introduces two approaches that use a large language model: one in which the mapping is performed within the large language model, and another in which it is performed externally. We tested these methods on graph maps and language navigation instructions; the results revealed the capacity limits of the large language model and showed that the external mapping approach succeeds.
