Online Target Level Control in natural reinforcement learning

Haruki Ebihara

12:20 PM - 12:40 PM

[4E2-GS-2-02] Online Target Level Control in natural reinforcement learning

〇Haruki Ebihara¹, Tatsuji Takahashi¹, Yu Kono¹ (1. Tokyo Denki University)

Keywords: reinforcement learning, machine learning, decision-making

When humans face an unfamiliar reinforcement learning task, they usually search quickly until they reach a certain level of performance and stop searching once that level is achieved. This property motivated the search method Risk-sensitive Satisficing (RS), proposed in previous studies. We have shown that RS is more efficient in trial and error and performs as well as or better than conventional methods that aim for optimization. RS has been extended to learning with state transitions by combining it with Global Reference Conversion (GRC), a method that converts the global aspiration level into an aspiration level for each state; the combination is called RS+GRC. However, while the current RS+GRC performs well when the optimal aspiration level is given, methods for proactively adjusting the aspiration level have not been discussed in depth. In this study, we propose a dynamic, stepwise goal modification algorithm for reinforcement learning based on goal attainment, aiming to deal with tasks in which the scale of the reward function and the level of task attainment are unknown.
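To make the satisficing idea concrete, the following is a minimal sketch of RS-style action selection on a Bernoulli bandit with a fixed target level (usually called the aspiration level in the RS literature). The abstract does not state a formula. The value function used here, RS_i = (n_i / N)(E_i − ℵ), is the form commonly given for RS in bandit settings and is an assumption, as are all function names and parameters. The sketch does not cover GRC or the target level adjustment proposed in this study.

```python
import random


def rs_values(counts, means, aspiration):
    """Assumed RS value: (n_i / N) * (E_i - aspiration) for each action."""
    total = sum(counts)
    if total == 0:
        # No trials yet: every action is valued equally.
        return [0.0 for _ in counts]
    return [(n / total) * (m - aspiration) for n, m in zip(counts, means)]


def select_action(counts, means, aspiration):
    """Pick the action with the largest RS value (first index on ties)."""
    values = rs_values(counts, means, aspiration)
    return max(range(len(values)), key=values.__getitem__)


def run_bandit(probs, aspiration, steps, seed=0):
    """Run greedy RS selection on a Bernoulli bandit (hypothetical setup)."""
    rng = random.Random(seed)
    counts = [0] * len(probs)
    means = [0.0] * len(probs)
    for _ in range(steps):
        a = select_action(counts, means, aspiration)
        reward = 1.0 if rng.random() < probs[a] else 0.0
        counts[a] += 1
        # Incremental update of the empirical mean reward.
        means[a] += (reward - means[a]) / counts[a]
    return counts, means
```

Under this assumed form, when every action falls short of the aspiration level the least-tried action has the value closest to zero and is selected, which drives exploration. Once some action exceeds the level, weighting by trial share favours continuing with it. A call such as `run_bandit([0.4, 0.7], aspiration=0.6, steps=500)` can be used to observe this behaviour.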


Presentation information

[4E2-GS-2] Machine learning
