Generation of Caption Data Using Prompt Engineering for Road Environmental Risk Analysis

Atsuya Ishikawa

6:00 PM - 6:20 PM

[1D5-GS-10-04] Generation of Caption Data Using Prompt Engineering for Road Environmental Risk Analysis

〇Atsuya Ishikawa¹, Koki Inoue², Kota Shimomura²,³, Kazuaki Ohmori², Ryuta Shimogauchi², Reoto Wakabayashi², Ryota Mimura¹, Osamu Ito¹ (1. Honda R&D Co., Ltd., 2. Elith Inc., 3. Chubu University)

Keywords: Prompt Engineering, Large Language Model

With the spread of driver assistance systems and autonomous driving technologies, their effectiveness in reducing traffic accidents has been discussed. To reduce accidents further, however, it is crucial to explain traffic accident risks and analyze their mechanisms. Research on explainable multimodal networks for driving scenes has explored generating captions that account for recognizable objects by using metadata. Such methods typically focus on captions for dynamic objects, such as humans. To explain traffic accident risks in driving scenes, however, caption generation should also consider static risks arising from road signs and road structures. Existing large-scale multimodal networks have difficulty generating captions that address these road environment risks. To tackle this challenge, we propose a caption generation method that leverages prompt engineering to cover both dynamic objects and static potential risks. Experiments using the generated captions confirmed that the method can produce captions that consider both dynamic objects and static potential risks.
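
The abstract does not give the prompt itself, so the following is only a minimal sketch of what a prompt that asks for both dynamic objects and static road environment risks might look like. The `SceneMetadata` class, its field names, the `build_caption_prompt` function, and the prompt wording are all hypothetical assumptions for illustration, not the authors' method. No model is called; the code only assembles the prompt text.

```python
from dataclasses import dataclass, field


@dataclass
class SceneMetadata:
    """Hypothetical per-scene metadata; the field names are illustrative assumptions."""
    dynamic_objects: list[str] = field(default_factory=list)
    static_elements: list[str] = field(default_factory=list)


def build_caption_prompt(scene: SceneMetadata) -> str:
    """Assemble a text prompt that requests a caption covering both kinds of risk."""
    # Fall back to an explicit marker so the prompt never has an empty slot.
    dynamic = ", ".join(scene.dynamic_objects) or "none detected"
    static = ", ".join(scene.static_elements) or "none detected"
    return (
        "You are describing a driving scene for traffic risk analysis.\n"
        f"Dynamic objects: {dynamic}\n"
        f"Static road environment: {static}\n"
        "Write one caption that explains the risks posed by the dynamic objects "
        "and the potential risks implied by the static road environment."
    )


# Example scene with made-up values, for illustration only.
scene = SceneMetadata(
    dynamic_objects=["pedestrian near crosswalk"],
    static_elements=["stop sign", "blind intersection"],
)
prompt = build_caption_prompt(scene)
print(prompt)
```

Listing the static elements on their own line is one possible way to keep road signs and road structures from being overlooked in favor of moving objects; the actual prompt design would need to be taken from the paper.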

Presentation information

[1D5-GS-10] AI application: Movement
