JSAI2024

Presentation information

Organized Session

[3T5-OS-6b] OS-6

Thu. May 30, 2024 3:30 PM - 4:50 PM Room T (Room 62)

Organizers: Kazunori Terada (Gifu University), Michita Imai (Keio University), Seiji Yamada (National Institute of Informatics)

4:30 PM - 4:50 PM

[3T5-OS-6b-04] Fetch-and-Carry Tasks by Domestic Service Robots Based on Multimodal Retrieval Models with Switching Mechanism Using Large Language Models

〇Ryosuke Korekata, Kanta Kaneda, Shunya Nagashima, Yuto Imai, Komei Sugiura (Keio University)

Keywords: Domestic Service Robot, Object Retrieval, Large Language Model, Multimodal Foundation Model, Fetch-and-Carry

In this study, we aim to develop a domestic service robot (DSR) that carries an everyday object to a piece of furniture in response to an open-vocabulary instruction, by retrieving images of the target object and the receptacle from images collected in the environment. We propose a multimodal retrieval model that retrieves target objects and receptacles separately with a single model, using a switching mechanism based on large language models. Experimental results show that our method outperformed the baseline methods on newly built datasets in terms of standard metrics. Furthermore, our method achieved task success rates of more than 80% in physical experiments.
