[2Win5-97] Response Design for Large Multimodal Models Leveraging Geographic Map Information
Keywords: AI, Multimodal, Geographical Information
This study proposes an approach to support the prediction of road traffic and weather in low-data environments by leveraging large multimodal models (LMMs). By building on the commonsense knowledge embedded in pre-trained large language models (LLMs) and integrating external information such as map images, surrounding geographic data, and points of interest (POIs), the approach aims to enable diverse response generation. Instruction-tuning datasets related to geographic and weather data were developed progressively, incorporating Japan-specific geographic features and tourist resource data. In addition, adopting multi-turn dialogue formats in which speakers embody diverse personas enhanced the diversity and practicality of the responses. The study suggests potential applications not only in supporting traffic and weather prediction but also in smart-city development and region-specific use cases.
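As an illustration of how such an instruction-tuning record might be organized, the following is a minimal sketch assuming a JSON Lines corpus that pairs a map image with POIs, a speaker persona, and a multi-turn dialogue. All field names, file paths, personas, and example content here are hypothetical and are not taken from the paper's actual dataset schema.

```python
# Hypothetical sketch of one instruction-tuning record for a geography-aware LMM.
# Field names, paths, personas, and dialogue content are illustrative assumptions.
import json
from dataclasses import dataclass, field, asdict
from typing import Dict, List


@dataclass
class InstructionExample:
    """One record pairing a map image with geographic context and a dialogue."""
    map_image: str                      # path to a rendered map tile or screenshot
    pois: List[Dict[str, str]]          # nearby points of interest (name, category)
    persona: str                        # speaker persona used to diversify responses
    dialogue: List[Dict[str, str]] = field(default_factory=list)  # multi-turn exchange

    def add_turn(self, role: str, text: str) -> None:
        self.dialogue.append({"role": role, "content": text})


def build_example() -> InstructionExample:
    # Assumed example: a tourist persona asking about road and weather conditions
    # around a Japanese sightseeing area.
    ex = InstructionExample(
        map_image="maps/kyoto_arashiyama_tile.png",
        pois=[
            {"name": "Togetsukyo Bridge", "category": "tourist_attraction"},
            {"name": "Arashiyama Station", "category": "transport"},
        ],
        persona="first-time tourist traveling by bus",
    )
    ex.add_turn("user", "How crowded do the roads near Togetsukyo Bridge get on weekend afternoons?")
    ex.add_turn("assistant",
                "Weekend afternoons around the bridge are typically congested; "
                "arriving before 10 a.m. or taking the train avoids most delays.")
    ex.add_turn("user", "What if it rains?")
    ex.add_turn("assistant",
                "Rain usually thins foot traffic but slows buses on the riverside road, "
                "so allow extra travel time and check the latest forecast.")
    return ex


if __name__ == "__main__":
    # Append the record in JSON Lines format, a common layout for tuning corpora.
    with open("geo_instruction_data.jsonl", "a", encoding="utf-8") as f:
        f.write(json.dumps(asdict(build_example()), ensure_ascii=False) + "\n")
```

Keeping the map image as a path alongside structured POI metadata and persona-tagged turns lets the same record serve both visual grounding and response-diversity objectives during tuning; the actual paper may use a different schema.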