JSAI2025

Presentation information

[2L5-GS-1] Fundamental AI, theory, algorithm:

Wed. May 28, 2025 3:40 PM - 5:20 PM Room L (Room 1007)

Chair: Kazuhiro Nakadai (Institute of Science Tokyo)

4:20 PM - 4:40 PM

[2L5-GS-1-03] A Generative Model for Diverse Policies Based on Post-hoc Inference of Latent Intentions

〇Yu Kono¹,² (1. DeNA Co., Ltd., 2. Tokyo Denki University)

Keywords: Generative Model, Reinforcement Learning, Machine Learning

Recent advances in large-scale generative models have enabled the recognition, generation, and transformation of various modalities, such as language, speech, images, and video. This suggests significant progress toward general AI, yet the main purpose of an intelligent agent is to adapt to its environment by generating actions. The generation of action concepts and policies remains underdeveloped: success so far has been limited, relies on complex prompt engineering, and lacks a solid theoretical foundation. Reinforcement learning (RL) is a primary method for generating actions, but it requires extensive training to find a single optimal policy, which makes it costly and leaves the generated policies lacking in diversity. Even so, RL retains strong exploratory ability. This study proposes a hybrid model that combines the diversity of generative models with the exploration ability of RL. Inspired by the human tendency to assign meaning to actions retroactively, we introduce latent variables into the policy parameters and embed trajectories using a specialized generative model. The model is trained successfully using methods such as contrastive learning and stationary distribution estimation.
