RTD3D: A Key-Point Estimation Method for Sports Equipment by A 3D CNN Segmentation Model for Time Sequential Images

Taichi Hosoi

3:30 PM - 3:50 PM

[3T5-GS-7-01] RTD3D: A Key-Point Estimation Method for Sports Equipment by A 3D CNN Segmentation Model for Time Sequential Images

〇Taichi Hosoi¹, Bob Fisher², Hirohisa Hioki¹ (1. Kyoto University, 2. The University of Edinburgh)

[Online]

Keywords: Deep Learning, Image Recognition, Motion Analysis, Convolutional Neural Network, Sports Analysis

Image recognition technology has evolved rapidly in recent years and is now applied in many fields, including sports science. One long-term goal is to develop systems that analyse player motion across different sports. In this study, we focus on tennis and present a method (RTD3D) that tracks the position of the racket tip in the frames of a video captured from a single viewpoint. RTD3D is based on a deep convolutional neural network. It is trained to take time-sequential frames from a tennis video and generate a confidence map for each frame, with a strong confidence peak at the position of the racket tip. To reduce false detections, pre-processing adaptively blurs the background, and post-processing applies a particle filter for tracking, followed by a Hampel filter to improve smoothness. Experiments on a tennis service video show that racket tip detection using RTD3D is more accurate than with our previous method. We also show that the pre- and post-processing improve the accuracy and smoothness of the racket tip trajectory estimates.
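
As a rough illustration of two steps named in the abstract, the sketch below reads a key-point position off the peak of a per-frame confidence map and then applies a Hampel filter to one coordinate of the resulting trajectory. It is not the authors' implementation: the function names, the window size, the threshold of three scaled MADs, and the toy data are all assumptions made for this example, and the particle filter stage is omitted.

```python
import numpy as np


def peak_position(conf_map):
    """Return (x, y) of the strongest response in a 2D confidence map."""
    conf_map = np.asarray(conf_map, dtype=float)
    row, col = np.unravel_index(np.argmax(conf_map), conf_map.shape)
    return int(col), int(row)


def hampel_filter(values, half_window=3, n_sigmas=3.0):
    """Replace outliers in a 1D sequence with the local median.

    A sample is treated as an outlier when it deviates from the median of
    its window by more than n_sigmas times the scaled median absolute
    deviation (MAD). half_window and n_sigmas are illustrative defaults.
    """
    x = np.asarray(values, dtype=float)
    y = x.copy()
    scale = 1.4826  # makes the MAD consistent with a Gaussian std. dev.
    n = len(x)
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)  # window is truncated at the ends
        window = x[lo:hi]
        med = np.median(window)
        mad = scale * np.median(np.abs(window - med))
        if abs(x[i] - med) > n_sigmas * mad:
            y[i] = med
    return y


# Toy confidence map with a single peak at column 5, row 2.
toy_map = np.zeros((4, 8))
toy_map[2, 5] = 0.9
tip_xy = peak_position(toy_map)

# Toy x-coordinates of a tracked point with one false detection (200).
xs = [10, 11, 12, 13, 200, 15, 16, 17, 18]
smoothed = hampel_filter(xs)
```

In a two-dimensional trajectory the same filter would be applied to the x and y coordinates separately. Because the filter only replaces samples it flags as outliers, the remaining points keep their original values.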

Presentation information

[3T5-GS-7] Vision, speech media processing
