Non-Markovian environment and experience replay

Hiroyuki Ohta

9:40 AM - 10:00 AM

[2Q1-OS-27a-03] Non-Markovian environment and experience replay

〇Hiroyuki Ohta¹, Kohki Higuchi², Takahashi Tatsuji², Ishizuka Toshiaki¹ (1. National Defense Medical College, 2. Tokyo Denki University)

Keywords: Reinforcement Learning, Experience Replay

This paper explores solutions to the challenges of applying reinforcement learning (RL) algorithms to non-Markovian environments, drawing on the hippocampal capacity for experience replay. In such environments, many trial-and-error iterations are needed to train a discriminator that can distinguish states using contextual information. In contrast to artificial agents, animals can quickly reproduce successful behaviors even in non-Markovian tasks with complex reward and state-transition conditions. Recent research highlights the role of the rodent hippocampus in path planning and in solving such tasks through repeated experience replay before movement begins. We propose a novel RL model that solves non-Markovian tasks by replaying previously successful action patterns before action selection and applying replay-based temporal biases to action values. By ruminating on past successful behaviors, the model significantly reduces the number of trial iterations required. Our model presents a promising approach to RL in non-Markovian environments and offers opportunities to further connect neuroscience and AI research.
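The abstract does not specify how the replay-based temporal bias is computed, so the following is only a minimal sketch under stated assumptions, not the authors' model. It assumes that successful episodes are stored as action sequences, that "replay" counts how often each action was taken at a given time step in those episodes, and that the resulting frequency is added to the action values before a greedy choice. The names `replay_bias`, `select_action`, and `strength` are hypothetical.

```python
from collections import Counter


def replay_bias(successes, t, n_actions, strength=1.0):
    """Hypothetical replay step (an assumption, not the paper's rule).

    Returns, for each action, the fraction of stored successful episodes
    that took that action at time step t, scaled by `strength`.
    """
    counts = Counter(ep[t] for ep in successes if t < len(ep))
    total = sum(counts.values())
    if total == 0:
        # Nothing to replay yet: no bias is applied.
        return [0.0] * n_actions
    return [strength * counts.get(a, 0) / total for a in range(n_actions)]


def select_action(q_values, successes, t, strength=1.0):
    """Greedy choice over action values plus the replay-based bias."""
    bias = replay_bias(successes, t, len(q_values), strength)
    biased = [q + b for q, b in zip(q_values, bias)]
    return max(range(len(biased)), key=biased.__getitem__)


# Toy non-Markovian case: the agent sees the same observation at every step,
# but the rewarded action depends on the position within the episode.
successes = [[0, 1, 0], [0, 1, 0]]  # previously rewarded action sequences
q_values = [0.0, 0.0]               # the observation alone cannot separate the steps
actions = [select_action(q_values, successes, t) for t in range(3)]
```

In this toy case the action values are identical at every step, so the choice is driven entirely by the time-indexed bias, which reproduces the stored successful sequence. Indexing the bias by time step is one possible reading of "temporal"; the actual model may define it differently.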

Presentation information

[2Q1-OS-27a] New Developments in Reinforcement Learning
