JSAI2025

Presentation information

Poster Session

[2Win5] Poster session 2

Wed. May 28, 2025 3:30 PM - 5:30 PM Room W (Event hall D-E)

[2Win5-83] A note on motion recognition for tire inspection based on the cooperative use of object tracking models and video-LLMs

〇Kyohei Kamikawa1, Ren Togo1, Keisuke Maeda1, Takahiro Ogawa1, Miki Haseyama1 (1.Univ. of Hokkaido)

Keywords: Industrial applications, Object tracking, Large language model (LLM), Video-LLM, Dense video captioning

Tire inspection involves multiple steps for each part of the tire, and each step must be carried out accurately. Action recognition technology is therefore needed to support tire inspection. In this paper, we propose a method for action recognition in tire inspection based on the cooperative use of an object tracking model and a Video-LLM. Dense video captioning, which provides detailed captions for each video segment, has advanced with the advent of Video-LLMs. However, it has been difficult for these models to focus on detailed actions in a video and perform specialized recognition. Our method uses an object tracking model to crop the objects that are important for inspection action recognition from the video, which enables the Video-LLM to generate detailed captions for those specific objects. We quantitatively demonstrate the effectiveness of this method through experiments on actual tire inspection videos.
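The cropping step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the tracker, the Video-LLM, or how the crop region is chosen. The sketch assumes a hypothetical tracker that returns one bounding box per frame, and it assumes the crop region is the padded union of those boxes, so that the cropped clip keeps a constant frame size before it is passed to a captioning model. The names `union_box`, `crop_clip`, and `tracked_boxes` are illustrative only.

```python
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels; x2 and y2 are exclusive
Frame = List[List[int]]          # toy grayscale frame: a list of pixel rows


def union_box(boxes: Sequence[Box], pad: int, width: int, height: int) -> Box:
    """Smallest box covering every per-frame box, padded and clamped to the frame."""
    x1 = max(min(b[0] for b in boxes) - pad, 0)
    y1 = max(min(b[1] for b in boxes) - pad, 0)
    x2 = min(max(b[2] for b in boxes) + pad, width)
    y2 = min(max(b[3] for b in boxes) + pad, height)
    return (x1, y1, x2, y2)


def crop_clip(frames: Sequence[Frame], boxes: Sequence[Box], pad: int = 0) -> List[Frame]:
    """Crop every frame to one fixed region (assumed strategy, not from the paper)."""
    if not frames or len(frames) != len(boxes):
        raise ValueError("need one box per frame")
    height, width = len(frames[0]), len(frames[0][0])
    x1, y1, x2, y2 = union_box(boxes, pad, width, height)
    return [[row[x1:x2] for row in frame[y1:y2]] for frame in frames]


# Toy data: three 6x8 frames and a hypothetical tracker output drifting to the right.
frames = [[[10 * r + c for c in range(8)] for r in range(6)] for _ in range(3)]
tracked_boxes = [(1, 1, 3, 3), (2, 1, 4, 3), (3, 2, 5, 4)]

# The cropped clip is what would be handed to a Video-LLM for captioning.
clip = crop_clip(frames, tracked_boxes, pad=1)
```

Using one fixed region per segment is only one possible design; cropping each frame to its own box and resizing would follow the object more tightly but would change the apparent motion within the clip.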
