Visual Instruction Tuning using Richly Decorated Traffic Volume Heatmaps

Ryoichi Kojima; Atsunori Minamikawa

[4Xin2-88] Visual Instruction Tuning using Richly Decorated Traffic Volume Heatmaps

〇Ryoichi Kojima¹, Atsunori Minamikawa¹ (1.KDDI Research, Inc.)

Keywords: AI, Multimodal

Alongside advances in Large Language Models (LLMs), a growing body of literature reports that Instruction Tuning incrementally improves the zero-shot and few-shot performance of Large Multimodal Models (LMMs). While existing research mainly emphasizes the broad applicability of these models across diverse benchmarks, we focus on a specific task: predicting future traffic volume. Specifically, we report findings on how Instruction Tuning can be applied effectively to improve traffic volume predictions for specific time periods, such as rush hours or weekends. The improvement is achieved by transforming traffic volume heatmaps overlaid on maps into richer images that integrate additional information, including the date and time, latitude and longitude coordinates, and a comprehensive color scale.
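The decoration step described in the abstract could look roughly like the sketch below. It is an illustration only and not the authors' pipeline: it uses the Python standard library to render a small grid of traffic volumes as an SVG image, then adds the three kinds of information the abstract lists (date and time, latitude and longitude, and a color scale). The function names, the blue-to-red color mapping, the layout constants, the sample grid, and the fields of the instruction record are all assumptions made for this example, and the underlying base map is omitted.

```python
from datetime import datetime

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def color_for(value, vmax):
    """Map a volume in [0, vmax] onto a blue (low) to red (high) hex color."""
    t = 0.0 if vmax <= 0 else max(0.0, min(1.0, value / vmax))
    return f"#{round(255 * t):02x}00{round(255 * (1 - t)):02x}"


def decorated_heatmap_svg(grid, when, lat_range, lon_range, cell=40):
    """Return an SVG string: heatmap cells plus timestamp, coordinates, color scale."""
    rows, cols = len(grid), len(grid[0])
    vmax = max(max(row) for row in grid)
    left, top = 80, 40
    right, bottom = left + cols * cell, top + rows * cell
    out = [
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{right + 120}" height="{bottom + 40}" font-size="12">'
    ]

    # Decoration 1: date, weekday, and time, so rush hours and weekends are visible.
    stamp = f"{when:%Y-%m-%d} ({WEEKDAYS[when.weekday()]}) {when:%H:%M}"
    out.append(f'<text x="{left}" y="20">{stamp}</text>')

    # The heatmap itself: one colored square per grid cell.
    for r, row in enumerate(grid):
        for c, volume in enumerate(row):
            out.append(
                f'<rect x="{left + c * cell}" y="{top + r * cell}" '
                f'width="{cell}" height="{cell}" fill="{color_for(volume, vmax)}"/>'
            )

    # Decoration 2: latitude and longitude at the edges of the grid.
    lat_min, lat_max = lat_range
    lon_min, lon_max = lon_range
    out.append(f'<text x="5" y="{top + 12}">lat {lat_max:.3f}</text>')
    out.append(f'<text x="5" y="{bottom}">lat {lat_min:.3f}</text>')
    out.append(f'<text x="{left}" y="{bottom + 20}">lon {lon_min:.3f}</text>')
    out.append(f'<text x="{right - 60}" y="{bottom + 20}">lon {lon_max:.3f}</text>')

    # Decoration 3: a five-step color scale with the volume each color stands for.
    for i in range(5):
        level = vmax * (4 - i) / 4
        y = top + i * 20
        out.append(
            f'<rect x="{right + 20}" y="{y}" width="20" height="20" '
            f'fill="{color_for(level, vmax)}"/>'
        )
        out.append(f'<text x="{right + 45}" y="{y + 14}">{level:.0f}</text>')

    out.append("</svg>")
    return "\n".join(out)


def instruction_sample(image_path, horizon_minutes, answer):
    """Pair a decorated image with a question and answer (hypothetical record layout)."""
    question = (
        f"Based on this traffic volume heatmap, describe the expected "
        f"traffic volume {horizon_minutes} minutes later."
    )
    return {"image": image_path, "instruction": question, "response": answer}


# Made-up 2x2 grid and coordinates, for illustration only.
example_svg = decorated_heatmap_svg(
    [[0, 5], [10, 20]], datetime(2024, 5, 27, 8, 0), (35.60, 35.70), (139.70, 139.80)
)
example_record = instruction_sample("heatmap_0800.svg", 60, "(placeholder answer)")
```

Writing the decorations into the image itself, rather than only into the text prompt, is the idea the abstract points to: the model then sees the timestamp, the coordinates, and the meaning of each color in the same input as the heatmap.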


Presentation information

[4Xin2] Poster session 2
