JSAI2023

Presentation information

[3T5-GS-7] Vision, speech media processing

Thu. Jun 8, 2023 3:30 PM - 5:10 PM Room T (Online)

Chair: 吉田 周平 (NEC) [Online]

3:30 PM - 3:50 PM

[3T5-GS-7-01] RTD3D: A Key-Point Estimation Method for Sports Equipment by A 3D CNN Segmentation Model for Time Sequential Images

〇Taichi Hosoi¹, Bob Fisher², Hirohisa Hioki¹ (1. Kyoto University, 2. The University of Edinburgh)

[Online]

Keywords: Deep Learning, Image Recognition, Motion Analysis, Convolutional Neural Network, Sports Analysis

Image recognition technology has evolved rapidly in recent years and is now applied in many fields, including sports science. One long-term goal is the development of systems that analyse player motion across different kinds of sports. In this study, we focus on tennis and present a method (RTD3D) that tracks the position of the racket tip in the frames of a video captured from a single viewpoint. RTD3D is based on a deep convolutional neural network. It is trained to take time-sequential frames from a tennis video and generate a confidence map for each frame, with a strong confidence peak at the position of the racket tip. To reduce false detections, pre-processing adaptively blurs the background, and post-processing applies a particle filter for tracking, followed by a Hampel filter to improve smoothness. Experiments on a tennis service video show that racket tip detection using RTD3D is more accurate than with our previous method. We also show that the pre- and post-processing improve the accuracy and smoothness of the racket tip trajectory estimates.
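
As a rough illustration of two steps named in the abstract, the sketch below reads the peak of a per-frame confidence map and then smooths a coordinate sequence with a Hampel filter. It is not the authors' implementation: the function names, the window size, and the threshold are assumptions chosen for the example, and the particle-filter tracking stage is omitted entirely.

```python
from statistics import median


def peak_position(confidence_map):
    """Return (x, y) of the highest value in a 2-D confidence map (list of rows)."""
    best_value, best_xy = None, None
    for y, row in enumerate(confidence_map):
        for x, value in enumerate(row):
            if best_value is None or value > best_value:
                best_value, best_xy = value, (x, y)
    return best_xy


def hampel_filter(values, half_window=3, n_sigmas=3.0):
    """Replace outliers in a 1-D sequence with the local median.

    half_window and n_sigmas are illustrative defaults, not values from the paper.
    """
    k = 1.4826  # scales the median absolute deviation to a Gaussian std estimate
    filtered = list(values)
    n = len(values)
    for i in range(n):
        # Window is truncated at the sequence boundaries.
        window = values[max(0, i - half_window):min(n, i + half_window + 1)]
        med = median(window)
        mad = median([abs(v - med) for v in window])
        if abs(values[i] - med) > n_sigmas * k * mad:
            filtered[i] = med
    return filtered


# Hypothetical x-coordinates of the racket tip, with one false detection (200).
xs = [10, 11, 12, 13, 200, 15, 16, 17, 18]
smoothed_xs = hampel_filter(xs)
```

In a two-dimensional setting the same filter could be applied to the x and y sequences independently; whether RTD3D does so is not stated in the abstract.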
