JSAI2023

Presentation information

General Session

[1O5-GS-7] Vision, speech media processing

Tue. Jun 6, 2023 5:00 PM - 7:00 PM Room O (E1+E2)

Chair: 真矢 滋 (Toshiba) [On-site]

6:00 PM - 6:20 PM

[1O5-GS-7-04] Analysis and Improvement of machine learning method for first-person video toward real-world application

〇Taiho Takeuchi1, Yoshifumi Seki1, Yoshinao Sato1 (1. Fairy Devices inc.)

Keywords: first-person video, computer vision

In this study, we apply machine learning techniques to first-person videos and analyze in detail the experimental results obtained with an existing method, Ego-Exo.
In recent years, machine learning research on first-person videos has become popular.
However, few detailed analyses of prediction model outputs have been published, and knowledge for practical application is lacking.
The analysis yields two findings.
First, label prediction performance depends on the number of samples available for each label.
Labels with a large number of samples are predicted with high performance.
Second, label prediction performance is high for obvious actions and objects, and low for other labels.
These findings are important for building datasets for domain-specific tasks.
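
The first finding relates per-label performance to per-label sample count. A minimal sketch of how such a relation could be inspected is given below, assuming single-label predictions. The function name `per_label_recall` and the toy labels are hypothetical illustrations; they are not taken from the paper or from Ego-Exo, and the authors' actual metric may differ.

```python
from collections import Counter


def per_label_recall(y_true, y_pred):
    """Return {label: (support, recall)} for single-label predictions.

    support = number of ground-truth samples with that label
    recall  = fraction of those samples predicted correctly
    """
    support = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    # Counter returns 0 for labels that were never predicted correctly.
    return {label: (n, correct[label] / n) for label, n in support.items()}


# Hypothetical toy data, not results from the paper.
y_true = ["cut", "cut", "cut", "cut", "pour", "pour", "peel"]
y_pred = ["cut", "cut", "cut", "pour", "pour", "cut", "cut"]

stats = per_label_recall(y_true, y_pred)

# List labels from most to fewest samples to compare support with recall.
for label, (n, recall) in sorted(stats.items(), key=lambda kv: -kv[1][0]):
    print(f"{label:5s} support={n} recall={recall:.2f}")
```

Sorting by support places frequent and rare labels side by side, which makes it easy to see whether low scores cluster among labels with few samples.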
