Automation and Execution of Fetch-and-Carry Tasks Based on Multi-Modal Language Processing

Motonari Kambara

6:10 PM - 6:30 PM

[2I6-OS-4a-03] Automation and Execution of Fetch-and-Carry Tasks Based on Multi-Modal Language Processing

〇Motonari Kambara¹, Komei Sugiura¹ (1. Keio University)

Keywords: Fetch-and-carry Task, Object Grounding, Multimodal Language Processing, Vision & Language, Domestic Service Robots

Domestic support robots that communicate naturally with users and assist with daily tasks are a promising solution for people who need support. However, their ability to understand natural language instructions and carry out daily tasks properly is still insufficient. In this paper, we aim to build methods that allow robots to execute fetch-and-carry instructions given in natural language. To this end, we extend existing methods and propose a method applicable to tasks in which multiple candidate target objects and target areas are given as input and the best candidate is predicted. Additionally, by incorporating an instruction generation model, we make it possible to run simulations on the fly. In experiments, our method outperformed existing methods in instruction understanding, and we confirmed that the task was executed properly.

Presentation information

[2I6-OS-4a] Interaction Design of Trust and Context
