JSAI2024

Presentation information

General Session

[1D5-GS-10] AI application: Movement

Tue. May 28, 2024 5:00 PM - 6:20 PM Room D (Temporary room 2)

Chair: Tomu Tominaga (Nippon Telegraph and Telephone Corporation)

6:00 PM - 6:20 PM

[1D5-GS-10-04] Generation of Caption Data Using Prompt Engineering for Road Environmental Risk Analysis

〇Atsuya Ishikawa (1), Koki Inoue (2), Kota Shimomura (2, 3), Kazuaki Ohmori (2), Ryuta Shimogauchi (2), Reoto Wakabayashi (2), Ryota Mimura (1), Osamu Ito (1) (1. Honda R&D Co., Ltd., 2. Elith Inc., 3. Chubu University)

Keywords: Prompt Engineering, Large Language Model

As driver assistance systems and autonomous driving technologies spread, their effectiveness in reducing traffic accidents has been widely discussed. Reducing accidents further, however, requires explaining traffic accident risks and analyzing their mechanisms. Research on explainable multimodal networks for driving scenes has explored generating captions that account for recognizable objects by using metadata. Such methods typically focus on captions for dynamic objects, such as people. To explain traffic accident risks in driving scenes, however, caption generation should also cover static risks caused by road signs and road structures. Existing large-scale multimodal networks have difficulty generating captions that address these road environment risks. To tackle this challenge, we propose a caption generation method that leverages prompt engineering to cover both dynamic objects and static potential risks. Experiments using the generated captions confirmed that the method can produce captions that consider both dynamic objects and static potential risks.
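
As a rough illustration of the idea in the abstract, a prompt that asks a language model to cover both dynamic objects and static road-environment risks might be assembled from scene metadata along the following lines. This is only a minimal sketch under stated assumptions: the abstract does not describe the authors' actual prompts or metadata format, so the `SceneMetadata` class, its field names, the `build_caption_prompt` function, and the prompt wording are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SceneMetadata:
    """Hypothetical per-scene metadata; the field names are illustrative only."""
    dynamic_objects: list[str] = field(default_factory=list)   # e.g. pedestrians, vehicles
    static_elements: list[str] = field(default_factory=list)   # e.g. road signs, road structure


def build_caption_prompt(scene: SceneMetadata) -> str:
    """Assemble a text prompt that asks for a caption covering both
    dynamic objects and static potential risks (assumed wording)."""
    dynamic = ", ".join(scene.dynamic_objects) or "none detected"
    static = ", ".join(scene.static_elements) or "none detected"
    return (
        "You are describing a driving scene for traffic risk analysis.\n"
        f"Dynamic objects: {dynamic}\n"
        f"Static road environment: {static}\n"
        "Write one caption that mentions the dynamic objects and explains "
        "any potential risk arising from the static road environment."
    )


# Example scene with made-up values.
scene = SceneMetadata(
    dynamic_objects=["pedestrian near crosswalk"],
    static_elements=["stop sign", "narrow intersection with poor visibility"],
)
prompt = build_caption_prompt(scene)
print(prompt)
```

The resulting string would then be passed to whichever language model is in use. Listing the static elements separately from the dynamic ones is one simple way to keep road signs and road structure from being dropped from the caption.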
