JSAI2023

Presentation information

General Session

[3D1-GS-2] Machine learning

Thu. Jun 8, 2023 9:00 AM - 10:20 AM Room D (A1)

Chair: 黄 勇太 (Beatrust) [On-site]

9:00 AM - 9:20 AM

[3D1-GS-2-01] Gamma Divergence-Based Inverse Reinforcement Learning for Sub-Optimal Trajectories

〇Daiko Kishikawa1, Sachiyo Arai1 (1. Chiba University)

Keywords: Inverse Reinforcement Learning, Sub-Optimal, Gamma Divergence

Inverse Reinforcement Learning (IRL) is a method for estimating the underlying reward from expert trajectories. IRL is used to imitate an expert through reinforcement learning in tasks where reward design is difficult, or to analyze human or biological intentions. Traditional IRL methods assume that expert trajectories are perfectly optimal, so when the trajectories are in fact sub-optimal, the estimated reward is sub-optimal as well. Several IRL methods address sub-optimal trajectories, and the dominant approach relies on an optimality ranking of the trajectories. However, these methods are strongly affected by the accuracy of the ranking data. We therefore model the sub-optimal trajectory distribution as a mixture of the optimal trajectory distribution and outliers, and propose an IRL method based on gamma divergence, which has the property of ignoring outliers. The proposed method can be applied to classification-based IRL methods and can be regarded as a generalization of the previously used cross-entropy-based methods. We evaluate the proposed method through computer experiments.
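
The abstract does not state the loss function itself. As a rough illustration of why a gamma-divergence-based loss can be seen as a generalization of cross-entropy, the sketch below uses the standard gamma cross-entropy for a binary classifier. It is not the authors' implementation: the function names, the toy probabilities, and the choice of gamma are all hypothetical, and the classifier here simply stands in for the "expert vs. non-expert" discriminator that classification-based IRL methods typically train.

```python
import numpy as np

EPS = 1e-12  # numerical guard; an assumption of this sketch


def cross_entropy(q, y):
    """Mean binary cross-entropy; q is the predicted P(y = 1)."""
    q = np.clip(np.asarray(q, dtype=float), EPS, 1.0 - EPS)
    p_true = np.where(np.asarray(y) == 1, q, 1.0 - q)
    return -np.mean(np.log(p_true))


def gamma_cross_entropy(q, y, gamma):
    """Gamma cross-entropy for a binary classifier (gamma > 0).

    Each sample contributes p(y_i | x_i)^gamma, normalized by the
    (1 + gamma)-norm of the predicted distribution. A sample with a very
    small p(y_i | x_i) contributes almost nothing to the mean, which is
    what makes the loss insensitive to outliers.
    """
    q = np.clip(np.asarray(q, dtype=float), EPS, 1.0 - EPS)
    p_true = np.where(np.asarray(y) == 1, q, 1.0 - q)
    norm = q ** (1.0 + gamma) + (1.0 - q) ** (1.0 + gamma)
    score = p_true ** gamma / norm ** (gamma / (1.0 + gamma))
    return -np.log(np.mean(score)) / gamma


# Hypothetical toy data: four well-fit samples plus one gross outlier.
q_clean = np.array([0.90, 0.80, 0.95, 0.85])
y_clean = np.ones(4, dtype=int)
q_noisy = np.append(q_clean, 1e-6)
y_noisy = np.ones(5, dtype=int)

ce_jump = cross_entropy(q_noisy, y_noisy) - cross_entropy(q_clean, y_clean)
gce_jump = (gamma_cross_entropy(q_noisy, y_noisy, 0.5)
            - gamma_cross_entropy(q_clean, y_clean, 0.5))
```

In this sketch, letting gamma approach 0 recovers the ordinary cross-entropy, while a larger gamma shrinks the influence of the outlier, so `gce_jump` stays much smaller than `ce_jump`.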
