The 38th Annual Conference of the Japanese Society for Artificial Intelligence, 2024

Presentation Information

International Session

[3Q5-IS-2b] Machine learning

Thursday, May 30, 2024, 15:30 to 17:10, Room Q (Conference Room 402)

Chair: Rafal Rzepka (Hokkaido University)

16:30 to 16:50

[3Q5-IS-2b-04] Generative Model of Policies: Exploring the Latent Space with Human Feedback

〇Raffael Bolla Di Lorenzo¹, Michita Imai¹ (1. Keio University)

Keywords: Reinforcement Learning, Generative Model, Human Feedback

Reinforcement learning often trains a population of agents with diverse behaviors. Such a population can be used to train a robust agent that can, for instance, cooperate with a human partner, or simply to discover many ways to solve a given task.
Generative Models of Policies can discover a wide range of agent policies that succeed at a given task without requiring a separate set of parameters for each policy. Moreover, they can adapt to new tasks or goals simply by optimizing in the learnt latent space of policies.
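The idea of one shared parameter set plus a latent code can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the tiny tanh policy, the `SHARED_W` weights, the toy `episode_return`, and the hill-climbing search are hypothetical stand-ins, not the model or optimizer used in the paper.

```python
import math
import random

# Hypothetical shared parameters: one weight set serves every policy.
SHARED_W = {"obs": 0.5, "z": (1.0, -1.0)}

def policy(obs, z):
    """Latent-conditioned policy: the action depends on the observation and on z."""
    pre = SHARED_W["obs"] * obs + sum(w * zi for w, zi in zip(SHARED_W["z"], z))
    return math.tanh(pre)

def episode_return(z, target):
    """Toy stand-in for a task return: rewards actions close to `target`."""
    observations = [-1.0, 0.0, 1.0]
    return -sum((policy(o, z) - target) ** 2 for o in observations)

def optimize_latent(target, steps=200, sigma=0.3, seed=0):
    """Hill-climb over z only; SHARED_W is never modified."""
    rng = random.Random(seed)
    z = [0.0, 0.0]
    best = episode_return(z, target)
    for _ in range(steps):
        cand = [zi + rng.gauss(0.0, sigma) for zi in z]
        score = episode_return(cand, target)
        if score > best:  # accept only improvements
            z, best = cand, score
    return z, best

if __name__ == "__main__":
    z, best = optimize_latent(target=0.5)
    print("latent:", z, "return:", best)
```

Changing `target` plays the role of a new goal: only the latent vector is searched again, while the shared weights stay fixed.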
In this paper, we focus on understanding and exploring the latent space of policies to discover new behaviors. More specifically, we take inspiration from StyleGAN's mapping network to better structure the latent space. We then design an exploration protocol that uses human feedback to discover new behaviors.
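The abstract does not spell out the mapping network or the feedback protocol, so the following is only a sketch of one way the two pieces could fit together. The untrained random MLP `make_mapping_network`, the pairwise-preference loop `explore_with_feedback`, and the `prefer` callback (which would be answered by a person watching rollouts) are all hypothetical names and design choices, not the authors' method.

```python
import random

def make_mapping_network(z_dim, w_dim, hidden=8, seed=0):
    """Small untrained MLP z -> w, loosely analogous to StyleGAN's mapping network."""
    rng = random.Random(seed)
    w1 = [[rng.gauss(0, 1) for _ in range(z_dim)] for _ in range(hidden)]
    w2 = [[rng.gauss(0, 1) for _ in range(hidden)] for _ in range(w_dim)]

    def leaky_relu(x):
        return x if x > 0 else 0.2 * x

    def mapping(z):
        h = [leaky_relu(sum(a * b for a, b in zip(row, z))) for row in w1]
        return [sum(a * b for a, b in zip(row, h)) for row in w2]

    return mapping

def explore_with_feedback(mapping, prefer, z_dim, rounds=20, sigma=0.5, seed=0):
    """Pairwise-preference exploration: keep the candidate the human prefers."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(z_dim)]
    history = [mapping(z)]  # intermediate latents of the accepted behaviors
    for _ in range(rounds):
        cand = [zi + rng.gauss(0, sigma) for zi in z]
        # prefer(new, current) stands in for a human comparing two behaviors.
        if prefer(mapping(cand), mapping(z)):
            z = cand
            history.append(mapping(z))
    return z, history

if __name__ == "__main__":
    mapping = make_mapping_network(z_dim=4, w_dim=3)
    # Simulated human: prefers a larger first coordinate of the mapped latent.
    z, history = explore_with_feedback(mapping, lambda new, cur: new[0] > cur[0], z_dim=4)
    print(len(history), "accepted behaviors")
```

In a real study the `prefer` callback would show the two behaviors to a participant and return their choice; the simulated preference above only makes the loop runnable.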
