Dynamic Reward Clustering

Ryota Higa

4:10 PM - 4:30 PM

[3K4-J-2-02] Dynamic Reward Clustering

〇Ryota Higa¹, Junya Kato¹ (1. NEC Corporation)

Keywords: Time series data, Reinforcement Learning, Imitation Learning, Reward Design

Real-world time series data exhibit a variety of patterns that arise from human operation. Our aim is to extract valuable information from time series data that include actions, which requires interpreting people's policies from those data. We propose an interpretable method for clustering dynamic rewards from time series data. By applying wavelet-transform preprocessing and simple clustering methods to human motion data and an inverted pendulum simulation, our approach successfully clustered different rewards and yielded interpretable features while preserving the time series information.
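
The abstract does not specify the wavelet family, the feature construction, or the clustering algorithm, so the following is only a minimal sketch of the general idea under stated assumptions: a Haar wavelet, relative detail energy per scale as the feature, and plain k-means as the "simple clustering method". The synthetic signal, the window length, and the helper names `haar_level_energies` and `kmeans` are hypothetical and are not taken from the presentation. Windowing is one possible way to keep the time ordering, so that a change of cluster label marks a change of behaviour over time.

```python
import numpy as np

def haar_level_energies(signal, levels):
    """Relative energy of Haar detail coefficients per level (fine to coarse).

    Assumption: Haar is used only because it is easy to write by hand;
    the abstract does not name a wavelet family.
    """
    approx = np.asarray(signal, dtype=float)
    energies = []
    for _ in range(levels):
        even, odd = approx[0::2], approx[1::2]
        detail = (even - odd) / np.sqrt(2.0)   # high-pass part at this scale
        approx = (even + odd) / np.sqrt(2.0)   # low-pass part passed to next scale
        energies.append(np.sum(detail ** 2))
    energies = np.array(energies)
    return energies / energies.sum()

def kmeans(features, k, iters=50):
    """Plain Lloyd's k-means with deterministic farthest-point initialisation."""
    centers = [features[0]]
    for _ in range(1, k):
        dists = np.min(
            [np.sum((features - c) ** 2, axis=1) for c in centers], axis=0
        )
        centers.append(features[np.argmax(dists)])
    centers = np.array(centers)
    for _ in range(iters):
        sq_dist = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = sq_dist.argmin(axis=1)
        new_centers = np.array([
            features[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

# Hypothetical data: one series whose pattern switches halfway, standing in
# for an operator whose underlying objective changes over time.
rng = np.random.default_rng(0)
t = np.arange(512)
series = np.where(t < 256, np.sin(2 * np.pi * t / 32), np.sin(2 * np.pi * t / 4))
series = series + 0.1 * rng.standard_normal(t.size)

# Non-overlapping windows keep the time ordering of the features.
window = 64
windows = series.reshape(-1, window)
features = np.array([haar_level_energies(w, levels=4) for w in windows])

labels = kmeans(features, k=2)
print(labels)  # one cluster label per window, in time order
```

Each feature dimension corresponds to one time scale, so a cluster can be described by the scales that carry most of its energy, which is one simple reading of feature interpretability. Real data would need a choice of window length and wavelet suited to the signal.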

Presentation information

[3K4-J-2] Machine learning: real world interaction
