JSAI2023

Presentation information

Organized Session

[3G1-OS-24a] Everyday Life Knowledge and AI

Thu. Jun 8, 2023 9:00 AM - 10:40 AM Room G (A4)

Organizers: 福田 賢一郎, 江上 周作, 宮田 なつき, Qiu Yue, 鵜飼 孝典, 古崎 晃司, 川村 隆浩, 市瀬 龍太郎, 岡田 慧

9:00 AM - 9:20 AM

[3G1-OS-24a-01] Finding Everyday Objects Using Physical-World Search Engines: a Learning–To–Rank Approach

〇Kanta Kaneda1, Motonari Kambara1, Komei Sugiura1 (1. Keio University)

Keywords: Learning to Rank, Multimodal Language Processing, Learning to Rank Physical Objects Task

In this study, we focus on the learning-to-rank physical objects task, in which target objects are retrieved from open-vocabulary user instructions in a human-in-the-loop setting. We propose MultiRankIt, which introduces the Crossmodal Noun Phrase Encoder to model the relationship between referring expressions and the target bounding box, and the Crossmodal Region Feature Encoder to model the relationship between the target object and its surrounding context. Our model outperforms the baseline method in terms of mean reciprocal rank and recall@K.
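
For readers unfamiliar with the two metrics named in the abstract, the following is a minimal sketch of how mean reciprocal rank and recall@K are commonly computed for a ranked retrieval list. It is not the authors' evaluation code: the function names, the string object identifiers, and the toy queries are all assumptions made for illustration.

```python
from typing import Iterable, Sequence, Set


def reciprocal_rank(ranked: Sequence[str], targets: Set[str]) -> float:
    """Return 1/rank of the first target in the ranked list, or 0.0 if none appears."""
    for position, item in enumerate(ranked, start=1):
        if item in targets:
            return 1.0 / position
    return 0.0


def recall_at_k(ranked: Sequence[str], targets: Set[str], k: int) -> float:
    """Return the fraction of targets that appear among the top-k ranked items."""
    if not targets:
        return 0.0
    return len(set(ranked[:k]) & targets) / len(targets)


def mean(values: Iterable[float]) -> float:
    """Average a collection of per-query scores (0.0 for an empty collection)."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0


# Hypothetical queries: (ranked candidate objects, set of correct target objects).
queries = [
    (["cup", "towel", "bottle"], {"towel"}),
    (["plant", "lamp", "book"], {"plant", "book"}),
]

mrr = mean(reciprocal_rank(ranked, targets) for ranked, targets in queries)
recall_at_1 = mean(recall_at_k(ranked, targets, 1) for ranked, targets in queries)
recall_at_2 = mean(recall_at_k(ranked, targets, 2) for ranked, targets in queries)
```

Mean reciprocal rank rewards placing the first correct object near the top of the list, while recall@K measures how many of the correct objects fall within the first K results, which matters when a user inspects only a short list of candidates.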
