JSAI2025

Presentation information

Poster Session

[2Win5] Poster session 2

Wed. May 28, 2025 3:30 PM - 5:30 PM Room W (Event hall D-E)

[2Win5-97] Response Design for Large Multimodal Models Leveraging Geographic Map Information

〇Ryoichi Kojima1, Yasutaka Nishimura1, Atsunori Minamikawa1, Masato Taya1 (1.KDDI Research, Inc.)

Keywords: AI, Multimodal, Geographical Information

This study proposes an approach to support the prediction of road traffic and weather in low-data environments by leveraging large multimodal models (LMMs). By building upon the commonsense knowledge embedded in pre-trained large language models (LLMs) and integrating external information such as map images, surrounding geographic data, and points of interest (POIs), this approach aims to enable diverse response generation. Instruction tuning datasets related to geographic and weather data were progressively developed, incorporating Japanese-specific geographic features and tourist resource data. Additionally, the adoption of multi-turn dialogue formats with speakers embodying diverse personas enhanced the diversity and practicality of the responses. This study suggests potential applications not only in supporting traffic and weather predictions but also in the development of smart cities and region-specific use cases.
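To make the dataset description above more concrete, the following is a minimal sketch of how one multi-turn, persona-tagged instruction-tuning record pairing a map image with nearby POIs might be represented. The abstract does not specify the authors' schema, so every field name (`map_image`, `pois`, `persona`, `turns`), the helper `build_sample`, and all example values are hypothetical placeholders, not the format used in the study.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Turn:
    role: str  # "user" or "assistant"
    text: str


@dataclass
class MapDialogueSample:
    # All field names are assumptions made for illustration only.
    map_image: str   # path to a map image (placeholder)
    pois: list       # names of nearby points of interest (placeholders)
    persona: str     # persona of the speaker asking the questions
    turns: list = field(default_factory=list)


def build_sample(map_image, pois, persona, qa_pairs):
    """Assemble one multi-turn record from (question, answer) pairs."""
    sample = MapDialogueSample(map_image=map_image, pois=list(pois), persona=persona)
    for question, answer in qa_pairs:
        sample.turns.append(Turn("user", question))
        sample.turns.append(Turn("assistant", answer))
    return sample


# Placeholder content, invented purely to show the structure.
sample = build_sample(
    map_image="maps/example_tile.png",
    pois=["train station", "shrine", "river bridge"],
    persona="tourist visiting the area for the first time",
    qa_pairs=[
        ("Which roads near the station look likely to be congested?",
         "<assistant answer grounded in the map and POIs>"),
        ("Would rain change that?",
         "<assistant answer that takes weather into account>"),
    ],
)

# Serialize as one JSON line, a common storage format for tuning data.
record = json.dumps(asdict(sample), ensure_ascii=False)
```

Keeping the persona as a separate field, rather than folding it into the question text, would make it straightforward to generate several dialogues over the same map with different speakers.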
