Reward Matrix Decomposition for Multi-Objective Inverse Reinforcement Learning

Daiko Kishikawa

11:20 AM - 11:40 AM

[4E1-GS-2-05] Reward Matrix Decomposition for Multi-Objective Inverse Reinforcement Learning

〇Daiko Kishikawa¹, Sachiyo Arai¹ (1. Chiba University)

Keywords: Multi-Objective Inverse Reinforcement Learning, Reward Matrix Decomposition

Inverse reinforcement learning (IRL), which estimates rewards from expert trajectories, has promising applications in imitating complex behaviors for which rewards are difficult to design and in understanding the intentions of humans and other organisms. Many current IRL methods assume that experts pursue a single objective, yet many real-world problems are multi-objective optimization problems. We therefore consider experts whose actions are determined by two factors: the reward for each objective and the weight assigned to each objective. Traditional methods either assume that the rewards are known or ignore the constraints on the weights. To address this, we propose a multi-objective IRL method that simultaneously estimates the weights and rewards while satisfying the constraints on the weights. Estimating both simultaneously enables a more detailed analysis of the expert's intentions and the generation of new behaviors. Applying the proposed method to basic benchmark problems, we show that it estimates weights and rewards more appropriately than traditional methods.
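The abstract does not state the exact formulation, so the following is only an illustrative sketch of the two factors it names, under assumptions that are not taken from the paper: the expert's reward is taken to be a weighted sum of per-objective rewards (linear scalarization), and the weight constraint is taken to be that weights are non-negative and sum to one. The function names `project_to_simplex` and `scalarize` are hypothetical and are not the authors' method.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto the probability simplex.

    Assumption (not from the abstract): the weight constraint is
    w_k >= 0 and sum_k w_k = 1.
    """
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for i, u_i in enumerate(u, start=1):
        cumulative += u_i
        t = (cumulative - 1.0) / i
        # The condition holds for a prefix of the sorted values;
        # the last index where it holds fixes the shift theta.
        if u_i - t > 0:
            theta = t
    return [max(x - theta, 0.0) for x in v]


def scalarize(rewards_per_objective, weights):
    """Weighted sum of per-objective rewards, one value per state.

    Assumption (not from the abstract): objectives are combined linearly.
    rewards_per_objective[s][k] is the reward of objective k in state s.
    """
    return [
        sum(w * r for w, r in zip(weights, row))
        for row in rewards_per_objective
    ]


# Toy example: 3 states, 2 objectives (values are made up for illustration).
reward_matrix = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = project_to_simplex([2.0, 0.0])  # an infeasible guess, projected
scalar_rewards = scalarize(reward_matrix, weights)
```

Under these assumptions, a simultaneous estimator would alternate or jointly update both the reward matrix and the weights, using a projection like the one above to keep the weights feasible after each update.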


Presentation information

[4E1-GS-2] Machine learning: agents
